Preface



Pierce Butler planted the seed of this work some 
thirty years ago when he attempted to persuade me to 
follow my translation of Georg Schneider's "Theory and 
History of Bibliography" with an English rendering of
Milkau's "Handbuch der Bibliothekswissenschaft." As
we discussed this in his cubbyhole at the University of 
Chicago it became increasingly apparent that for all the 
excellence and usefulness of the Milkau, a version in 
English would require such wide revision and additions 
as to constitute a completely new work, and the project 
was, I thought, abandoned. Over the years, however, 
it remained a subject of, perhaps sometimes subconscious,
dreaming, thinking, and occasionally some work.

When the Council on Library Resources was organ- 
ized its need for information on the state of the library 
art resulted in almost spontaneous germination of Pierce 
Butler's long dormant seed. This resulted in a grant
to the Graduate School of Library Service at Rutgers 
for preparation, under the direction of the undersigned, 
of a review of the status of our current knowledge of 
librarianship. From the inception of this project it was 
recognized that the initial grant could cover only some- 
what less than half of the vast field, and there are many 
areas of librarianship still to be covered. 

An advisory committee helped in the design of the 
program and in determining priorities for treatment of 
the various aspects of the field. 

The advisory committee included: Dr. Julian H. 
Bigelow, Institute for Advanced Study at Princeton, New 
Jersey, Mr. Verner W. Clapp, President of the Council 
on Library Resources, Mr. Donald Coney, Librarian of 
the University of California, Mr. J. W. Kuipers, Apparatus
research and development in Itek Corporation,
Dr. Robert D. Leigh, Dean of the School of Library 
Service at Columbia University, Dr. Lowell A. Martin, 
Dean of the Graduate School of Library Service at 
Rutgers, Professor A. J. Riker of the College of Agri- 
culture at the University of Wisconsin, Dr. Melville J. 
Ruggles, Vice President of the Council on Library Re- 
sources, Mr. A. N. Sears, Vice President of Reming- 
ton Rand, and Dr. Eugene H. Wilson, Vice President 
of the University of Colorado. 

Since the volumes now completed and being prepared 
for publication cover perhaps forty percent of the entire 
range of librarianship and bibliography, the plan of pub- 
lication is in the familiar "Handbuch" form so as to 
permit the addition of volumes as financial resources 
and dedicated writers become available. 

In preparing these for publication it appeared best 
to permit some variations in style from one volume to 
another rather than to devote a large percentage of our 
resources to achieving a standard style for all. 

The general pattern followed in these studies con- 
sists of a survey of the published and unpublished liter- 
ature of each facet of the field. In this survey, as a 
first step, each compiler attempts to summarize what 
the literature says with a minimum of redundancy but 
without editorial comment. Each statement is accom- 
panied by a footnote so that investigation in depth can 
be conducted when necessary, but for most purposes, 
if we have adequately performed our primary task, it 
should be unnecessary to search the literature for infor-
mation on the topics covered. A second step, provided in
most cases, is examination of the evidence provided to 
support each allegation or statement in the literature and 
an indication of whether that particular bit of "the art" is 
empirical or the extent and reliability of the objective data 
provided to support it. 

This pattern of presentation is modified in a number 
of the parts of the series since, except for the sake of ex- 
ternal uniformity, it would serve little purpose to repeat 
substantially every statement in this second part followed 
by the words, "no objective evidence."

It was initially planned to present each summary in
the historical perspective of the development of its field 
but this method of treatment was found unsuitable for 
some of the subjects. In some cases, such as the 
study of "Reading Devices for Micro-Images," both his-
torical and topical treatments are presented; in others
the treatment is historical only, and in still others pri- 
mary emphasis is given to topical treatment. Here as 
in the review of evidence, suitability for each topic is 
given priority over external uniformity of the set. 

While a few of these reports were prepared by staff 
assistants, as indicated by the list of titles and research 
staff, we were fortunate in enlisting participants of out- 
standing authority and reputation in most of the subject 
areas treated. In most cases the resultant first draft 
was read by one or more additional specialists in the 
field and in this work we had the assistance of such 
well known authorities as: Hubbard W. Ballou, Ralph 
H. Carruthers, Ralph Esterquest, Robert A. Fairthorne, 
John Fall, Charles F. Gosnell, Lutz Helbig, Laurence 
Kipp, Alfred H. Lane, Chester Lewis, Calvin N. Mooers, 
Robert H. Muller, Maurice F. Tauber, Lawrence S. 
Thompson, and others. 

This is the concerted work of many hands, and 
many more than those listed above have helped in almost 
countless ways. Such value as may reside in this ser- 
ies is the result of professional contributions of high 
order by many people. Final responsibility for the plan 
of work, selection and supervision of research staff, re- 
vision of manuscripts as well as editing them and pro- 
duction of the volumes rested solely on the undersigned 
and he accepts full responsibility for the imperfections 
to be found in this series. It is hoped that this review 
of the library art will, on balance, be found to make a 
useful contribution and that it may some time be carried 
forward to cover the whole field of librarianship. 

Ralph R. Shaw

Volume Four Part One
NOTCHED CARDS 

by 
Felix Reichmann 



The function of a library is not confined to the ac-
quisition and preservation of material. Though many
librarians have bibliophilic interests and are intent on 
enriching their collections, yet the ultimate goal of the 
library is use. The modern conception of technical 
services in libraries regards acquisition as an integral 
part of the preparation process and considers its task 
fulfilled only when the book has passed the three stages: 
selection, acquisition, cataloging, and has been placed 
at the disposal of the reader. 

Much thought has gone into the problem of how to 
organize library material for use, and different possibil- 
ities have been carefully examined. The conventional 
dictionary catalog, its division into an author and a sub- 
ject file, the classified arrangement of catalog cards, 
the formation of smaller bibliographical files destined 
for a selected clientele (technical reports in special li-
braries, imprint files in Rare Book Departments, etc.),
and finally the physical arrangement on the shelves ac-
cording to a subject classification or by current num-
ber: all have been described and their specific advan-
tages recognized.

All these bibliographical techniques have certain 
properties in common: 1. The method of organization 
(the input of information) determines the retrieval pos- 
sibilities. For example, an author catalog can only 
tell which titles of a given author are available, and an 
imprint file will provide only information as to place 
and publisher or as to date of publication. 2. The 
nature of the information to be given must be envisaged 
from the outset; no provision can be made for new cat- 
egories. 3. An excessive number of cards would be 
necessary to bring out all possible combinations of sub-
ject aspects; e.g., a title with five descriptors would
need 120 cards (5! = 120). Cross-references are only a
partial remedy for this shortcoming. 4. The cards
must be kept in a specific sequence. 5. It is extreme-
ly difficult to substantiate a negative answer, which in
scholarly research is often as important as a positive
one.
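
The figure of 120 cards quoted under point 3 can be checked by
enumeration. The sketch below is illustrative only: it assumes that
every ordering of a title's five descriptors calls for a card of its
own, and the descriptor names are hypothetical.

```python
from itertools import permutations
from math import factorial

# Five hypothetical descriptors assigned to one title.
descriptors = ["alloys", "corrosion", "testing", "aircraft", "standards"]

# Each ordering of the five descriptors is counted as one catalog card
# (an assumption made here to reproduce the 5! of the text).
orderings = list(permutations(descriptors))

print(len(orderings))   # number of cards under this assumption
print(factorial(5))     # the same figure computed as 5!
```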

The complexity of modern research and the aware- 
ness of the multitude of subject relations which may be 
significant in future scholarly endeavours impel librar- 
ianship to search for new methods of bibliographical 
control which would augment the conventional techniques 
and permit a greater flexibility in the retrieval of infor- 
mation (1). A simple practice to increase the efficiency 
of a cardfile without adding a new subject heading or 
being forced to inspect every single card, is the use of 
little colored flags or notches punched out from the up- 
per edge of the card to signal specific information. The 
latter device (triangular notches made with the Copeland- 
Chatterson single hole punch) is employed at the class- 
ified catalog of the Institute of Cancer Research (London) 
to identify titles which have "a considerable and easily
usable bibliography" (2). In this instance punches are
auxiliary to conventional cards; punched cards proper
reverse the roles and assign the primary function of
information retrieval to holes and notches (3).

Punched cards as memory storage of patterns were
employed to control the operations of the loom invented
by Joseph Marie Jacquard in 1801; Herman Hollerith
adapted the basic idea for the construction of a tabulat-
ing machine for the U.S. Census Bureau in 1886. The
first patent for a marginal punched card was granted to
H. P. Stamford (1896, No. 564,117); subsequent patents
were granted to W. M. Stretch (1907, No. 867,618) and
E. C. Molina (1914, No. 1,083,456). Independent of
these still rather primitive inventions, a notched card was
used to record information gathered at a hookworm sur-
vey in Brazil in 1920. The decisive step to produce a
generally applicable punched card was taken by Alfred
Perkins. Mr. Perkins resorted to marginal punches to
sort tickets used by the Dunlop Rubber Company in
Birmingham. The printer of these tickets, Copeland-
Chatterson, obtained the inventor's permission to apply
for an English patent on a royalty basis and became the
pioneer in the commercial distribution of marginal
punched cards.

Mr. Perkins received an American patent in 1925
(U.S. Patent 1,544,172) and sold the U.S. rights to the
McBee Corporation in 1932. In subsequent years scores 
of patents both in this country and in Europe have been 
granted to improve upon or to vary the Perkins inven- 
tion (3a). 

The characteristic features of the marginal punched 
card are: 1. The cards have one row (or multiple 
rows) of pre-perforated holes around the edges. 2. The 
center of the card remains free for conventional media 
of information (typing, design, microfilm, etc.). 3. All
retrieval aspects of one title are recorded on one card, 
thus eliminating the necessity of multiplying cards. 
4. Combinations of aspects, although not perceived 
when the file was set up, can be searched for. One 
may generalize that punched cards reduce the conven- 
tional catalog in space but extend its usefulness in time. 
The cards can show both logical sums (a+b, also called 
alternation as it also signifies a or b), and logical prod- 
ucts (ab). 5. Cards can be filed at random. 

The versatility of the card demands great care and 
exactness in the formulation of the information to be 
transmitted. The literature of the field reiterates the 
stern advice: "No device, simple or complicated, can 
compensate for thoughtlessness in the analysis of infor- 
mation or for sloppiness in the use of terminology" (4). 

In art history it has been observed that the intro- 
duction of new material necessitated the adaptation of 
new techniques and thus created a new style (5); sim-
ilarly, new bibliographical techniques often work more
efficiently if they are based on new semantic formula- 
tions which have been constructed in keeping with the 
characteristics of the new tool. 

This statement, however, is not universally accept-
ed. Many American designers of punched-card equip-
ment suggest to their customers the use of "tailor-made"
descriptors often sharply deviating from library termin-
ology, but Bloomfield cautions "Let's not reject conven-
tional subject-headings" (6). Haykin advocates the con-
struction of "classifications chiefly designed for infor- 
mation retrieval" (7) and Stroem believes that multi- 
dimensional schemes may be preferable to conventional 
ones (8). Conversely, many European librarians write 
most favorably of their experiences with the Universal 
Decimal Classification. Ruston considers it "Easily
applicable, ... however, only figures can be used; all
other symbols have to be eliminated" (9). Grobe reports
"Excellent results" (10) and Weigelin praises the
punched-card file of the eye clinic in Bonn which has
applied the UDC to integrate the material from different 
eye hospitals (11). Finally the International Federation 
for Documentation fairly endorses the use of this sys-
tem in the conclusion of the chapter on Decimal Class-
ification (12): "The UDC is a suitable classification
basis for mechanical selection in documentation, without 
any restriction for edge punched cards, but with some 
restrictions for surface punched cards. It is disting- 
uished for its encyclopedic structure, its numerical 
basis, its international character and above all, for the 
test of practical use which it has survived. All bases 
for coding of content of articles are available. Future 
experiences with mechanical aids can easily be worked 
in with the UDC."

Punched cards transmit information through holes 
and notches and not through letters and numerals. The 
translation of the conventional medium into the language 
of the punched card is called coding. One of two things 
can be done to the prepunched hole of a marginal 
punched card; it can be left intact, or it can be enlarged 
to reach either the edge of the card or the next hole. 
We have thus a binary code where every digit (letter,
word) is either zero or one. It is by no means a new in-
vention to convey information through a two-state code.
Cryptography has used it often; Francis Bacon employed
it in his biliteral code, and the dots and dashes of the
Morse Code are well known (13).

It is important to remember that the hole, if left
untouched, does not represent any value, although the
explanatory label is printed alongside. It becomes mean-
ingful when it is notched and the hole thus extends to
the outer edge. If the sorting needle is introduced into
the hole, the notched cards will fall from the needle,
these being the cards which have been selected. Two
additional operations are feasible with cards which have
a double row of holes. The hole in the inner row can
be connected with a notched hole and thus reach the
edge (deep punch), or two holes can be connected with-
out reaching the edge (intermediate punch). In the
latter case the card will not fall from the needle but
only drop 1/4 of an inch. A second needle has to be
introduced into a guide hole and the first withdrawn to
have the card drop out completely. A third variation
is theoretically possible. If two holes in the inner 
row can be connected horizontally, the card would not 
drop but slide sideways (14). Cards with three rows 
add other possibilities of punching. Cards with multi- 
ple rows (more than three) are rarely used. 

The single hole which corresponds to a digit (letter, 
word, etc.) is called a code position. A number of 
code positions, generally four to six, all pertaining to 
a single subject or classification, can be combined into 
a code field. Two or more code fields related to the 
same subject can be joined in a code section. 

The simplest method of coding which assigns only 
one meaning to every hole is called direct coding. The 
question which has to be answered through the medium 
of one sorting needle is a simple one: yes or no. The 
hole left intact is "no", the notch is "yes" (15).
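
Direct coding and needling can be imitated in a few lines. This is a
minimal sketch under assumptions not found in the source: the hole
labels, the titles, and the names Card and needle are hypothetical.
It shows one needle answering one yes-or-no question, and how
repeated needling yields a logical product or a logical sum.

```python
from dataclasses import dataclass, field


@dataclass
class Card:
    title: str
    # Labels of the holes that have been notched open to the edge.
    notched: set = field(default_factory=set)


def needle(deck, hole):
    """Pass one needle through `hole`.

    Notched cards fall from the needle ("yes"); cards whose hole is
    intact stay on it ("no"). Returns (dropped, retained).
    """
    dropped = [card for card in deck if hole in card.notched]
    retained = [card for card in deck if hole not in card.notched]
    return dropped, retained


deck = [
    Card("Title 1", {"chemistry", "history"}),
    Card("Title 2", {"chemistry"}),
    Card("Title 3", {"history", "bibliography"}),
]

# One question, one needle.
chemistry, rest = needle(deck, "chemistry")

# Logical product (chemistry AND history): needle the dropped cards again.
chemistry_and_history, _ = needle(chemistry, "history")

# Logical sum (chemistry OR history): add what the retained cards yield.
history_only, _ = needle(rest, "history")
chemistry_or_history = chemistry + history_only

print([card.title for card in chemistry_and_history])
print([card.title for card in chemistry_or_history])
```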

The capacity of the edge punched cards for informa-
tion storage is very great indeed. A card with a
single row of twenty-six holes permits a variety of com-
binations equal to two to the power of twenty-six. Every
hole permits one of two operations, and as the card has
twenty-six holes approximately seventy million combina-
tions are possible (2^26 = 67,108,864); in the case of
fifty-two holes the possibilities of variation would be
four thousand billion (2^52, about 4.5 x 10^15, reckoning
a billion as a million million).
That means that in a library of seventy million titles,
each title could be identified with direct coding by using
a punched card file with twenty-six holes to the card.
The characterization, however, would consist in a
variety of twenty-six conceptions. Obviously this is a
purely academic proposition. More important from the
practical point of view is the serious disadvantage that
the sum total of aspects to be coded cannot be higher
than the number of holes on a single card. This meth-
od, however, is the simplest coding technique and the
easiest way to retrieve information. Furthermore, if
the questions are formulated in keeping with the coding
structure, there can be no unwanted cards. It is not
always the fastest system; if the digits 0-9 are coded
directly, nine operations with the needle will be neces-
sary to sort a pack of cards numerically from zero to
nine, and every card would be handled on the average
5 1/2 times (16).
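
The arithmetic of this paragraph can be checked with a short sketch.
The capacity figures are plain powers of two. The sorting model is an
assumption made here, not the source's stated procedure: the holes
are needled in the order 0, 1, 2, and so on, dropped cards are set
aside, and every digit is taken to be equally frequent.

```python
# Number of distinct notch patterns on a single row of holes.
combinations_26 = 2 ** 26
combinations_52 = 2 ** 52


def direct_digit_sort_cost(digits=10):
    """Return (needlings, average handlings per card) for direct coding.

    Assumed model: one needling per hole for digits 0 .. digits-2; the
    cards still on the needle after the last pass carry the final digit.
    """
    needlings = digits - 1
    # A card coded d drops on pass d+1; cards coded with the last digit
    # are handled on every pass.
    handled = [min(d + 1, needlings) for d in range(digits)]
    return needlings, sum(handled) / digits


print(f"{combinations_26:,}")
print(f"{combinations_52:,}")
print(direct_digit_sort_cost())
```

Under this model the average comes to 5.4 handlings per card, close to
the figure of 5 1/2 given in the text; the counting convention behind
the source's figure is not stated.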

Slightly more sophisticated is the numerical sequence 
code generally represented by the code field: seven,
four, two, one. This field covers all the digits from
zero to nine. For zero the field is left intact, for three 
both two and one are notched, for five both four and one 
are notched, and so on. Only four needling operations 
are necessary and the number of cards to be needled 
would be about four times the size of the pack. As the 
notches are used in combinations, (four and one have to 
be combined to give five) this method is also called 
combination code. It is also possible to code the numbers
0-14 in a four-hole field. A code section can be ar-
ranged where the first field covers the units, the second
the tens, the third the hundreds and so forth. It is in- 
teresting to note that the numerical sequence runs from 
right to left. 
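
A sketch of the seven, four, two, one field and of sorting with four
needlings follows. Two points are assumptions made for illustration:
the notch patterns for the digits the text does not spell out (six as
four plus two, seven as the seven hole alone, and so on), and the
handling rule that cards falling from the needle are placed behind
those that stay on it. The names are hypothetical.

```python
# Notches for each digit on the field 7-4-2-1. The text gives 0, 3 and 5;
# the remaining entries follow the same pattern and are assumed here.
CODE = {
    0: (), 1: (1,), 2: (2,), 3: (2, 1), 4: (4,),
    5: (4, 1), 6: (4, 2), 7: (7,), 8: (7, 1), 9: (7, 2),
}

# Holes are needled from the right-hand end of the field to the left.
NEEDLING_ORDER = (1, 2, 4, 7)


def encode(digit):
    """Return the set of notched holes for one digit."""
    return frozenset(CODE[digit])


def decode(card):
    """A digit is the sum of the values of its notched holes."""
    return sum(card)


def sequence_sort(deck):
    """Sort a deck with one needling per hole (four in all)."""
    for hole in NEEDLING_ORDER:
        retained = [card for card in deck if hole not in card]
        dropped = [card for card in deck if hole in card]
        deck = retained + dropped  # dropped cards go to the back
    return deck


deck = [encode(d) for d in (5, 0, 9, 3, 7, 1, 8, 2, 6, 4)]
print([decode(card) for card in sequence_sort(deck)])
print(sum(NEEDLING_ORDER))  # largest number a four-hole field can hold
```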

By adding a fifth hole to the field this method can
be adapted to serve as a simple alphabetical code. In
this case the holes represent the letters of the alphabet
instead of digits: A = one, B = two, C = two plus one, and
so on for the first thirteen letters, a-m. To identify
the second thirteen letters, n-z, the fifth hole, marked
n-z, is notched, with N = one, O = two, P = two plus one,
etc. Only one number (or one letter) can be coded in
any one field. One needle is used at a time, and as the
single hole has a combination of meanings, an unequiv-
ocal selection of a card is not possible; however, it is
a most convenient method to order the cards in a se-
quence.

A better alphabetical code has been made by Cox, 
Bailey and Casey (17). A code field of five positions 
is marked with the letters O I E C B. All letters of 
the alphabet can be coded, most of them in combina- 
tions of two to four holes. Five letters need one posi- 
tion, ten letters require two, nine letters three, and 
three letters four positions. "A" is identified by leav-
ing the field intact. There are twenty-eight symbols,
because M is subdivided into three characters. This 
code too is applicable for sequence sorting only. 

If all cards of a certain group must be selected 
without having too many unwanted cards (the so-called 
noise-effect) a selector code has to be employed using 
two needles in a field of six holes. The most common 
arrangement is SF, seven, four, two, one, zero. SF
(single figure) is notched together with seven if this
digit is to be selected; for the selection of eight, holes
seven and one are notched; for coding zero, however,
the zero hole alone is notched.
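
The difference between a sequence code and a selector code can be
made concrete. The sketch below is illustrative: the text states the
selector patterns only for seven, eight and zero, so the remaining
entries (SF with one, two and four; pairs of holes for three, five,
six and nine) are assumptions by analogy, and the function name is
hypothetical. A card is taken to drop only if every needled hole is
notched.

```python
# Sequence (combination) code on the field 7-4-2-1.
SEQUENCE = {
    0: set(), 1: {"1"}, 2: {"2"}, 3: {"2", "1"}, 4: {"4"},
    5: {"4", "1"}, 6: {"4", "2"}, 7: {"7"}, 8: {"7", "1"}, 9: {"7", "2"},
}

# Selector code on the field SF-7-4-2-1-0 (entries other than 7, 8 and 0
# are assumed by analogy with the text).
SELECTOR = {
    0: {"0"}, 1: {"SF", "1"}, 2: {"SF", "2"}, 3: {"2", "1"}, 4: {"SF", "4"},
    5: {"4", "1"}, 6: {"4", "2"}, 7: {"SF", "7"}, 8: {"7", "1"}, 9: {"7", "2"},
}


def dropped_digits(code, wanted):
    """Digits whose cards fall when needles enter the holes coding `wanted`."""
    needles = code[wanted]
    return [digit for digit, notches in code.items() if needles <= notches]


# With the sequence code, needling for seven also releases eight and nine.
print(dropped_digits(SEQUENCE, 7))
# With the selector code, two needles (SF and seven) release seven alone.
print(dropped_digits(SELECTOR, 7))
```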

This method is also necessary if the number of 
conceptions to be coded exceeds the number of holes. 
A six-hole field, using two positions to code one aspect,
would give the choice of one out of fifteen possible
conceptions; if three holes are combined to code one
aspect, the choice increases to twenty. The mathemat-
ical rule is that the maximum number of aspects is
given if the number of positions used for one code is
half the number of holes in one field. However, the
number of possible choices has to be weighed against
the ease of retrieval. A three-position code needs
three needles where a two-position code needs only two.
Only one aspect can be coded in any one field. 
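
The counts of fifteen and twenty are binomial coefficients: the
number of ways of choosing the notched positions out of the holes of
the field. A brief sketch, with hypothetical names, confirms them and
the rule that the maximum falls at half the number of holes.

```python
from math import comb


def aspects(holes, positions_per_code):
    """Number of distinct codes when each code notches a fixed number of holes."""
    return comb(holes, positions_per_code)


# Choices offered by a six-hole field for one to five positions per code.
six_hole = {k: aspects(6, k) for k in range(1, 6)}
best = max(six_hole, key=six_hole.get)

print(six_hole)
print(best)  # number of positions per code giving the most choices
```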

In the triangle code (field of five holes) the ten 
symbols (0-9) are not printed parallel to the holes but 
arranged in a triangle or pyramid. The two holes are 
notched whose diagonal columns intersect at the digit 
or letter which has been selected. All the symbols can 
be used directly from the card without remembering the 
combinations necessary to form the letters or digits;
moreover, the field has been reduced from six to five
holes.

If the overall number of subjects to be coded is 
very large and more than one aspect should be recorded 
in any one field, superimposed coding has to be em- 
ployed. Each concept is coded by notching two or more 
holes, completely disregarding the fact that some of the 
notches may have been used already in another combin-
ation to identify another descriptor. A certain "noise
effect" (unwanted cards) is therefore unavoidable; how-
ever, the number of wrongly selected cards will be kept
to a minimum if the combinations are selected at ran-
dom. The reasons for this efficient result are the sig-
nificant statistical possibilities of random selection. It
has been calculated that optimum results will be ob-
tained if forty-six per cent of the positions in any given
field are being used. Because of the overlapping code,
actually only thirty-seven per cent of the holes available
will be notched. Translated into practical operations
this means that twelve letters can be coded in an alpha-
betical field of twenty-six positions.

Calvin N. Mooers has made important contributions 
to the mathematical theory of coding based on random 
selections (18). His method, called Zatocoding, uses a
card with forty holes, not broken up into sub-fields. Six-
ty-nine per cent of the positions can be utilized; because
of overlapping, however, only fifty per cent are actually
punched. The single code is based on a four-position
pattern. Therefore, in a card with forty holes, seven
descriptors can be recorded (40 x 0.69 / 4 = 6.9, or
about 7). A larger card with seventy-two positions can
identify twelve conceptions.
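
The figures for Zatocoding can be reproduced in a rough sketch. It is
not Mooers's own procedure: the function names, the descriptors and
the seeded random codebook are hypothetical, and the estimate of the
share of holes punched treats the marks as independent, which is an
assumption made here. The load of 0.69 and the four-position pattern
are taken from the text.

```python
import random


def capacity(positions, marks_per_descriptor=4, load=0.69):
    """Descriptors per card when `load` of the positions may be utilized."""
    return round(positions * load / marks_per_descriptor)


def expected_fill(positions, marks_per_descriptor, descriptors):
    """Estimated share of holes notched at least once (marks assumed independent)."""
    return 1 - (1 - marks_per_descriptor / positions) ** descriptors


def make_codebook(descriptors, positions=40, marks=4, seed=0):
    """Assign each descriptor a random pattern of `marks` distinct positions."""
    rng = random.Random(seed)
    return {d: frozenset(rng.sample(range(positions), marks)) for d in descriptors}


def punch(codebook, descriptors):
    """Superimpose the patterns of all descriptors on one card."""
    notched = set()
    for d in descriptors:
        notched |= codebook[d]
    return notched


def drops(card, codebook, query):
    """A card drops if every position of the query pattern is notched."""
    return codebook[query] <= card


vocabulary = [f"descriptor {i}" for i in range(1, 51)]
codebook = make_codebook(vocabulary)
card = punch(codebook, vocabulary[:7])

print(capacity(40), capacity(72))
print(round(expected_fill(40, 4, 7), 2))
# Cards dropped for a descriptor never assigned to them are the "noise".
print(sum(drops(card, codebook, d) for d in vocabulary[7:]))
```

Every descriptor actually punched into the card is always recovered;
the last line only counts how many of the remaining descriptors would
also release the card by chance with this particular codebook.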

Gilbert proposed a modification of the Zatocoding 
principle which he called orthographic single-field super- 
imposed code. Instead of random numbers, pairs of 
letters taken from the spelling of the aspects constitute 
the code. A card with fifty-five positions is recom-
mended, which can take care of the 676 possible English
letter pairs (26^2 = 676); twelve to thirteen descriptors
are permissible per card (19).

Regardless of the coding system which has been
adopted, the coding capacity of the card increases in
proportion to the number of holes available. It would 
be impracticable to enlarge the size of the card beyond 
a certain point; therefore card manufacturers resort to 
a double row (or multiple rows) of pre-punched holes. 
Fairly common is a double row arrangement for the nu-
merical code seven, four, two, one. A "deep punch"
enlarging the hole in the second row up to the edge of
the card identifies the selection of a single digit (e.g.,
seven); the holes in the outer row will be notched if a
combination of holes is necessary (seven and one for
eight), the same as in a single-row arrangement. The
triangular code too is often employed with a double row 
of holes (20). 

An interesting combination of a double and triple 
row has been worked out by the Oak Ridge Laboratory 
of the Atomic Energy Commission (21). The card 
(E-Z Sort) has 366 positions arranged in triple rows at 
top and bottom and double rows at the two sides. The 
top edge has two alphabet fields of twenty-three columns 
each in triple rows designated for subject indexing. The 
bottom edge has five fields in triple rows: one alpha-
bet field in twenty-three columns, three fields with six
columns for the author's name, and one with three columns
(for bibliographical information). Each side has six
fields, four columns in double rows each. These fields 
are double printed, either in the numerical code seven, 
four, two, one, or as a code section with a twenty- 
four-letter alphabet. They will identify the classifica- 
tion system or serve "any other use desired". The 
coding system is direct coding; subjects are coded by 
the first four letters. E-Z Sort offers a thirty position 
code field arranged in triple rows and marked with the 
digits 9-0. All digits from 000-999 can be identified 
with direct coding and selected with three needles. The 
outer row is notched for the units, the second row is 
deep-punched for the tens and the third row designates 
the hundreds. Three positions are needed to code one 
digit and three separate control holes identify the rep- 
etition of a digit in a given number. The control holes 
are marked H for repetition in the hundreds (883) and T 
for duplication in the tens (833).

A multiple-row arrangement provides a very great
storage capacity. Hardy has calculated the possibility 
of transferring the London Telephone Directory to a set 
of pre-punched cards. Both the subscriber's name and 
his telephone number consisting of three letters and four 
digits could be recorded on a triple-perforated card. 
He concludes that "a very fair telephone directory on
punched cards could be assembled which could be used
for finding names from numbers or numbers from
names" (22).

Crosz contributed an interesting probability calcula-
tion for a punched card with multiple-column arrange-
ment. A sub-field with twenty-four positions arranged
in three columns with eight rows or eight columns in 
three rows would have a storage capacity of 2024 as- 
pects (23). 
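
The figure of 2024 coincides with the number of ways of choosing
three positions out of twenty-four. Whether the calculation cited
takes exactly this form is an assumption; the arithmetic itself is as
follows.

```latex
\binom{24}{3} = \frac{24 \cdot 23 \cdot 22}{3 \cdot 2 \cdot 1}
              = \frac{12144}{6} = 2024
```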

From the point of view of information retrieval one 
has to keep in mind that the needling time increases 
with the growth of information storage. Furthermore, 
the combination of different methods of punching (deep
punch, intermediate, shallow) makes retrieval time-
consuming, cumbersome and difficult.

It is impossible to give any overall preference to 
any one of the four major methods of coding as every 
system has "its own peculiar advantages and disadvant-
ages" (24). If the number of positions available is
equal to or greater than the sum of the aspects to be 
coded, direct coding is the simplest and most efficient 
technique. If the number of conceptions which are 
needed increases beyond the number of holes, one has 
to resort to a combination of notches as provided by 
the sequence and selector codes. These methods are 
restricted to the selection of one aspect per field and 
to mutually exclusive aspects to be assigned to different 
fields or sections. Superimposed coding provides for a 
large number of concepts which are not mutually exclus- 
ive. Information retrieval, however, becomes more 
time consuming when the coding methods increase in 
complexity. 

In addition to edge-punched cards two other forms
of cards can be sorted manually and must be briefly
mentioned in this review. The slotted cards have many
features in common with the cards which have been des-
cribed above (25). They are also called punched cards
with central punching (cartes à pre-perforations cen-
trales, Schlitzlochkarten, or Feldlochkarten). They are
coded by punching out the space between two perfora-
tions and thus producing a slot, either horizontal or
vertical, similar to the "intermediate punch." Most
such cards have between two hundred and three hundred 
holes and are coded in the direct or combination meth- 
od. Sorting operations often require multiple needles 
and as freehand needling would be cumbersome, if pos- 
sible at all, most systems use especially designed but 
comparatively simple mechanical contrivances for infor- 
mation retrieval. 

The American Findex card reserves the upper part 
of the card for typed or printed information. About 
seventy per cent of the card space is pre-punched with 
round holes which are arranged in ten to thirteen col- 
umns, every column having ten to fourteen perforations. 
Two holes slotted in vertical direction form one code
position. The positions are numbered from the bottom
left corner. A typical Findex card has one hundred coding
positions arranged in ten columns with eleven perfora-
tions each. Both direct coding and a combination code
can be employed; a two-position code (using two need-
les) needs ten positions for forty-five aspects, a three-
position code can place one hundred and twenty aspects
on ten positions. Retrieval is performed with the help
of a selector. 

The German card produced by Aliform provides
slightly more than the upper half of the card for con-
ventional transcription of information. Coding is per-
formed by vertical slots within thirty-four columns of
ten round holes each. The cards are also adaptable 
both to sequence coding and superimposed coding. 

Three French systems have horizontal slots and de- 
part more radically from the features of the punched 
cards which have been reviewed so far. 



Selectri leaves the left portion of the card blank 
for conventional recording. The coding field consists 
of eighteen to twenty-two columns each with twelve 
round perforations. 

Detectri is similar but has a coding field of seven 
to thirty-eight rows of rectangular perforations, gener- 
ally ten to the column. There is ample space on the 
left part of the card for uncoded information. 

The characteristic features of the Dequeker cards 
are their formats and the Selector. The lower part of 
the card which contains the coding field is narrower 
than the upper part left for conventional recording. The 
coding position is, as in the two other French systems, 
a horizontal slot, in this instance made between twenty- 
five columns, each having five round perforations. The 
Selector is especially interesting. The cards are not 
in direct contact with one another but separated by
"spacers." After the needles have been inserted, the
Selector with the spacers moves sideways to the left. 
The spacers move the cards which have been slotted 
and the selected cards can be consulted in the Selector 
and need not be removed. 

The second system, Punched Cards with Visual 
Punching, is based on an entirely different principle 
(26). Instead of recording each title with its aspects 
on one card and selecting the titles desired by needling 
the file of T 'title cards, TT this system does exactly the 
opposite. It allots to every subject one card with a 
pre-printed grid and punches small holes in the positions 
which correspond to the serial numbers of the titles per- 
taining to the given subject. Information retrieval is 
made by superimposing a number of subject cards; the 
holes which have been punched in all the subject cards 
give the serial numbers of the titles which represent the
desired combination of aspects. This system can thus 
be called a subject-card system in contrast to the usual 
title or document card system. 
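
The superimposition described above amounts to intersecting sets of serial numbers. The following is a minimal sketch in Python, not a description of any actual installation; the subject names, the serial numbers, and the function names are invented for illustration.

```python
# Each subject card is modeled as the set of serial numbers punched into it.
# Subject names and serial numbers are hypothetical.
subject_cards = {
    "corrosion": {12, 47, 105, 230},
    "aluminum": {47, 88, 105, 411},
    "sea water": {47, 105, 230, 411},
}


def superimpose(cards, subjects):
    """Return the serial numbers punched in every one of the chosen cards.

    Light passes through a stack of cards only where all of them are
    punched, which is the intersection of their sets of holes.
    """
    chosen = [cards[name] for name in subjects]
    if not chosen:
        return []
    return sorted(set.intersection(*chosen))


def add_title(cards, serial, subjects):
    """Post a new title: punch its serial number into each subject card."""
    for name in subjects:
        cards.setdefault(name, set()).add(serial)


hits = superimpose(subject_cards, ["corrosion", "aluminum", "sea water"])
print(hits)  # [47, 105]
```

The result is only a list of serial numbers; the titles themselves still have to be looked up in the numerically arranged file.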

Holmstrom prefers the designation "superimposed
coincidentally punched cards" (27). Wildhack and Stern
(National Bureau of Standards) speak of "Optical
coincidence subject cards" (28). In America it is best known
as Peek-a-Boo, in Europe as the Batten-Cordonnier
system. (The official nomenclature in France is fiches
superposables; in Germany, Sichtlochkarten.)

The first systematic use of the subject card system 
was made by H. Taylor in 1915, who applied it to the 
identification of birds (29). Five years later H. Soper 
received an American patent for the use of subject 
cards in compiling statistical data, and in the same 
year (1920) C. J. Gray described its application for the 
identification of minerals in the Transactions of the 
Geological Society of South Africa. A French patent for 
searching personnel files by this method was granted to 
Bourgeaud and Liber in 1923. The idea was further 
developed by J. Cordonnier and formed the basis of the 
most important French aspect cards. The first applica- 
tion to literature searching was made by the German 
librarian R. Preddek in 1930. Preddek's serial num- 
bers referred to his collection of metal plates (Adrema) 
from which he printed the titles of the appropriate pa- 
pers. The best known English example of this method 
is W. E. Batten's control of patent files (30). 

The capacity of the majority of available cards
ranges between 500 and 20,000 serial numbers. Batten
uses an IBM card with 960 positions, the French Sphinxo
card has 1000, and Cordonnier's Selecto offers a variety
of cards with up to 20,000 positions. The German
manufacturers, Ekaha and Aliform, have models with a
capacity varying from 1860 to 7000 numbers. The Dutch
"Delta card" has a lozenge pattern placing 10,000
numbers on the intersections of the lines (31). The
Termatrex card distributed by Jonker Business Machines
is available with 10,000 or 40,000 positions.

The Office of Basic Instrumentation in the National
Bureau of Standards is developing the most far-reaching
application of this method (32). Its cards have no
pre-printed grid, and a transparent overlay with a
millimeter grid has to be used, which permits the punching of
up to 18,000 positions. The card stock is of vinylite
plastic which so far has met all the requirements of
"durability, dimensional stability, opacity, suitability
for typing and punchability." Punching errors can be
easily mended: a piece of paper card stock is punched
into the hole of the plastic card. "No cementing appears
necessary; the fibrous insert spreads sufficiently to lock
itself firmly in place" (33).

The method has several great advantages. By in- 
serting a new aspect card, a new subject can be added 
with ease as there is no code which has to be rearranged.
It is only necessary to note on the subject card
that it has been introduced from a certain serial number
on. The system will produce the desired numbers
very fast, because only a small number of subject cards 
has to be superimposed instead of searching in the en- 
tire file of title cards. It is necessary, however, to 
add a second step and to locate the selected titles in 
their numerical arrangement. 

The system works well in small and middle-sized
collections; no report on the efficiency of its control
over large holdings has been available so far. Gagarin
uses it in the Gmelin Institute and praises "the
simplicity and elasticity of the system" (34). The Office of
Basic Instrumentation uses this method as its principal
means for searching, and expresses satisfaction with
the results. The files contain 1000 subject cards and
control about 25,000 titles. The standard production
per day is about 2000 punches (about 200 documents). 
This, of course, does not include the much more ex- 
pensive time of the document analyst. 

Like most of the hand-manipulated systems,
Peek-a-Boo is at its very best if applied to a small collection.
It could very well be an almost ideal method of
controlling a private collection of reprints. Rothschuh, for
instance, reports most favorably on its usefulness in
organizing a collection of over 4000 reprints in the field
of physiology with 350 aspect cards (35).

The expansibility of the system is considerable:
Kistermann is working on a combination of marginal
punched cards and Peek-a-Boo, and Stern's Microcite
will combine abstracts and aspect cards (36). This
device, which is being explored by the Office of Basic
Instrumentation, consists of a film matrix of greatly
reduced (30:1) abstracts. The place of the abstract on
the film coincides with the position of the corresponding
serial number on the cards. With the help of a light
diffuser and a microscope the abstract can be read or
can be magnified and projected on a screen. There is
the further possibility of printing the abstract on a
sensitized card. At the moment a matrix is used which
contains 1000 microabstracts. The Office is also
considering the possibility of a subject card 42" x 22"
with a capacity for 500,000 positions.

The same principle (aspect card and no title card) 
forms the basis of the Uniterm system or Coordinate 
Indexing developed by Mortimer Taube and Associates 
(37). The serial number is not punched in on a pre- 
determined position but is written on the card according 
to a simple ingenious device. The Uniterm card is 
divided into ten columns marked by the digits 0-9. The 
numbers are entered in these columns according to 
their final digit. For the selection of a document which 
has the desired combination of subjects, the respective
Uniterm cards must be visually compared. Only the 
shortest column needs to be considered to find the com- 
mon number and thus over seventy per cent of the en- 
tries can be eliminated from the outset. The posting is 
tedious as numbers have to be written clearly, but Gull 
has developed a technique which, with the help of a 
ticket printer and photography, accelerates the work and 
eliminates mistakes (38). 
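
The posting rule and the column-by-column comparison can be sketched as follows. This is an illustrative Python model only, not Taube's own procedure; the function names, the two terms, and the document numbers are hypothetical.

```python
def new_card():
    """A Uniterm card: ten columns, marked by the digits 0 through 9."""
    return {digit: [] for digit in range(10)}


def post(card, number):
    """Write a document number in the column given by its final digit."""
    card[number % 10].append(number)


def coordinate(card_a, card_b):
    """Find the numbers common to two cards, comparing like columns only."""
    common = []
    for digit in range(10):
        # A shared number must end in the same digit on both cards,
        # so only columns with the same heading need to be compared.
        common.extend(set(card_a[digit]) & set(card_b[digit]))
    return sorted(common)


# Hypothetical postings for two terms.
alloys = new_card()
for n in (103, 217, 340, 451, 567):
    post(alloys, n)

fatigue = new_card()
for n in (217, 305, 451, 622):
    post(fatigue, n)

matches = coordinate(alloys, fatigue)
print(matches)  # [217, 451]
```

Because every number sits in the column of its final digit, a searcher never has to compare a number in one column against the other nine columns of the second card.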

For the terminology of the Uniterms, Taube has es- 
tablished a simple rule: "Enter every work in a Uni- 
term coordinate index system as a filing word on a 
single Uniterm card. Whenever in a particular system 
a word is used in one, and only one, descriptive phrase, 
enter that word as the filing word on a card, followed 
by the remaining word or words in the phrase. The 
word or words following the filing word on any card 
will themselves be filing words on other cards."

The resulting economy in cards as compared with
traditional subject headings is considerable. The
subject headings of the Technical Information Division of
the Library of Congress were converted into uniterms,
and 3620 uniterms plus 720 references for synonyms
replaced the former 25,000 subject headings and 24,000
cross references (39).

The center field of an edge punched card has ample 
space for conventional recording of information. It is 
also possible to insert a microfilm in this place, a pro- 
cedure which has been developed by Filmsort (40). A 
German documentalist proposed to trace a microfilm on 
a marginal punched card by heliographic methods 
(Lichtpausverfahren) and suggested that reading could 
be accomplished with the help of a simple magnifying 
glass (41). 

Frequently the abstract of a given title or document 
is typed in the center of the card. Many information 
services use this method to give their subscribers ready 
access to the literature of the field. The National As- 
sociation of Corrosion Engineers, for instance, has 
offered an "abstract punched card service" since 1951 
which issues about 2100 cards yearly. Conventional 
McBee cards, 5" x 8", are used with a double row of
holes. The abstract is condensed into about 200 words 
(42). 

An interesting example of the efficient use of the
center field for abstracts, charts, drawings, etc. is
the punched card catalog of aerodynamic measurements,
published by the Nationaal Luchtvaartlaboratorium in
Amsterdam. Marginal punched cards are employed with one
row of 168 holes around the edges. Direct coding is
used in forty-seven positions to identify aspects which
occur frequently: six holes indicate the year of
publication and twelve perforations are reserved for the first
three letters of the author's name. Nine holes code
the sub-group, eighty-four holes are given to
superimposed coding of aspects and the remaining nine are kept
in reserve. As manual sorting of a large file is
cumbersome, the cards are pre-sorted in eleven sub-groups
(43).

Various methods are feasible for the duplication of
marginal punched cards (44). The notches have to be
reproduced in a separate operation, single cards with 
the hand punch, larger files with a gang punch. Xerog- 
raphy, a dry photocopying method employed by many 
libraries for the reproduction of catalog cards, is a 
frequently recommended method for copying the text. 
The Ozalid process requires a translucent master which 
is copied on sensitized paper: the McBee Company fur- 
nishes both translucent and sensitized punched cards. A 
similar photographic process is Copyflex: a translucent 
master is copied on a punched card sensitized with 
diazo dyes. These cards, too, are manufactured by 
the McBee Corporation. 

Direct transfer of information from a marginal 
punched card to an electric typewriter has been
developed by Stubenrecht in his "Schreibende Randlochkarte"
(writing marginal punched cards) (45).

Marginal punched cards have been applied widely in 
preference to the conventional card catalog to achieve 
better and faster bibliographic control. There are hun- 
dreds of installations all over the world, most of which 
report satisfactory results (46). The majority of 
punched card files is concentrated in the fields of engi- 
neering, the exact and applied sciences and medicine; 
they can be found in numerous libraries especially in 
the Circulation and Acquisitions Departments, and they 
are used in many phases of business, for instance 
warehouse and sales control, personnel work and market 
research. 

The cards are durable (fifty per cent rag content),
coding mistakes can be easily mended by "card savers,"
and the life expectancy is probably better than that of
catalog cards. Filing cabinets are offered by most
manufacturers of cards, but no specially designed drawer
is necessary, and 800 cards will fit easily in a
fourteen-inch drawer. Measured in terms of space needs,
punched cards are more economical than the dictionary
catalog, but extravagant if compared with a printed
index (47). An average of five cards of the conventional
size 3" x 5" is needed to bring out all the
important aspects of one title, occupying 1.12 cubic inches in
the dictionary catalog; a notched punched card of the
size 5" x 8" needs 0.64 cubic inches; five entries in the
annual index of the Chemical Abstracts occupy 0.004
cubic inches. However, no conclusion should be hastily
drawn from this comparison because the three types of
bibliographic control differ in information content and
ease of retrieval.
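
The order of magnitude of these space figures can be checked with a little arithmetic. The sketch below is a rough plausibility check in Python and not part of the cited study; it assumes that a punched card has about the same thickness as a catalog card, which the text does not state.

```python
CATALOG_CARDS_PER_TITLE = 5   # from the text
CATALOG_VOLUME = 1.12         # cubic inches for those five 3" x 5" cards
INDEX_VOLUME = 0.004          # cubic inches for five printed index entries

# Thickness of one catalog card implied by the figures above.
thickness = CATALOG_VOLUME / (CATALOG_CARDS_PER_TITLE * 3 * 5)

# Assumption: a 5" x 8" punched card is about as thick as a catalog card.
punched_card_volume = 5 * 8 * thickness

# How many times more space the catalog entries need than the index entries.
catalog_to_index = CATALOG_VOLUME / INDEX_VOLUME

print(round(thickness, 4))            # 0.0149
print(round(punched_card_volume, 2))  # 0.6
print(round(catalog_to_index))        # 280
```

Under that assumption a single 5" x 8" card takes roughly half the space of the five catalog cards it replaces, while the printed index remains smaller than either by a factor of more than a hundred.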

No precise data are available which would enable
us to calculate even approximately the budgetary
provisions for setting up and using a marginal punched card
file. Perry and Kent (48) have worked out the basic
mathematical principles of cost analysis, but their
thought-provoking research is a report on work in
progress and not a "definitive statement of findings."
Moreover the authors were interested primarily in the
basic theory of information retrieval and their equations
and charts are not meant to be translated into dollars
and cents. Thorne (49) gave an interesting analysis of
efficiency (probability of success) and cost (making and
using the file) but he concluded that "cost figures are
not representative ... and no comparison of the various
systems can be made."

A number of the factors involved are not easily
measured and the results almost defy generalization.
A good example of the difficulties encountered is
Marjorie Hyslop's excellent analysis of cost data and
subsequent breakdown into four categories:

1. Cost of setting up the system. 

2. Cost of equipment. 

3. Cost of encoding. 

4. Cost of retrieval. 

The first group represents the work involved in
establishing the classification and determining the
terminology of aspects: "the cost is indeterminate but it
should not be underestimated." This task demands
well-trained professional personnel and the expenses will be
accordingly high. Miss Hyslop estimates that the
preparation of the classification for metallurgy took two
years and "the price would have been high in the five
figure bracket."

Equipment can be calculated easily. The cost
factor is almost negligible and a couple of hundred dollars
will suffice to set up a file of 5000 punched cards (51).

The cost of encoding, however, brings us back
again to the realm of question marks. "No effort has
been made to determine actual figures for this step
since they will vary considerably depending on the type
and size of the file maintained." The great variety of
estimates mentioned in the literature substantiates Miss
Hyslop's judgment. As a broad generalization, one
could suggest at least half to three-quarters of an hour of
professional work and one-quarter hour of clerical work per
title encoded.

We are on slightly firmer ground in discussing cost
of retrieval. Translating the question into the
appropriate code is a professional task and may take about
five minutes. The sorting of the cards has to be made
in batches of 200. The needling time is about one-half
minute per batch. However, aligning the cards and returning
them to the drawer easily takes another half minute; a file
of 5000 cards will, therefore, need about one-half hour of
sorting time. If more than one needle should be applied,
the needling time is correspondingly higher.
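
The estimate can be reproduced with a short calculation. The following Python sketch is only an illustration: the function name and parameters are invented, and it assumes that the half-minute figures apply to each batch and that aligning and refiling happen once per batch however many needles are applied.

```python
import math


def sorting_minutes(file_size, batch_size=200, needling=0.5,
                    handling=0.5, passes=1):
    """Estimate the manual sorting time for one search, in minutes.

    needling: minutes per batch for one pass of the needle (from the text)
    handling: minutes per batch to align and refile the cards (from the text)
    passes:   number of needle passes per batch (an assumption)
    """
    batches = math.ceil(file_size / batch_size)
    return batches * (needling * passes + handling)


print(sorting_minutes(5000))            # 25.0
print(sorting_minutes(5000, passes=2))  # 37.5
```

Twenty-five batches at about one minute each give the half hour quoted above; the five minutes of professional time for translating the question come on top of that.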

Summarizing the above analysis, we must conclude 
that we have exact cost data for the inexpensive items 
but only vague conceptions of the expensive ones. Fur- 
thermore, we have to acknowledge that the establish- 
ment of a punched card file is more expensive than the 
conventional card catalog. Whereas a title can be
cataloged for about $3.00, the preparation of a punched
card would be about double that amount per title pro- 
cessed. The reason for this price difference lies in 
the availability of standardized tools such as classifica- 
tion tables, subject heading lists and printed library cat- 
alogs, none of which are applicable to punched cards. 
Therefore, we cannot expect to decrease our cataloging 
costs by substituting punched cards for the conventional 
card catalog. 

Cost factors by themselves, however, are
meaningless; they have to be considered in conjunction with the
final result: the successful retrieval of information.
Bibliographical control by punched cards has certain ad- 
vantages: a greater combination of aspects, multiple 
access points on one card and thus a decrease in the 
size of the file and finally elimination of precise filing 
as the cards are not kept in any specific sequence. 
With the insert of microfilm in the center space of the 
card, its information content will exceed by far the 
specifications given by the descriptive cataloging of the 
conventional card. The combination of notched card 
and microfilm has far-reaching possibilities. 

"A more recent development in the field of
marginal punched cards is the use of sheet microfilm bearing
ten, or twenty or more frames of text at the usual
intermediate reduction ratios, plus marginal notching for
sorting. This appears to be coming closer to a
complete cycle searching operation, and with higher
reduction ratios could store a whole book together with all of
the sorting aspects in the form of marginal punches"
(52).

The disadvantages of hand-manipulated cards lie in
the necessary limitation of the size of the card file and
the number of aspects. The upper limit is in the
neighborhood of 10,000 cards; hand-sorting of larger files
would be too time-consuming. It is possible to divide
a file into sub-groups, but unless these groups are
mutually exclusive, the advantages of searching
aspect-combinations are lost. Instead of a hand needle, which
necessitates sorting in batches of about two hundred cards,
a simple sorting machine can be employed which
permits sorting of up to eight hundred cards (McBee
selector: two hundred and fifty cards) with multiple needles in
one operation. All these auxiliary methods can increase
the quantity of cards but cannot basically change the
inherent limitation in size.

The restriction in the number of access points can 
be partly overcome by superimposed coding. Whereas 
in direct coding the number of aspects cannot exceed 
the sum total of perforations, superimposed coding does 
not have this limitation. Restraint in the number of
aspects to be used is necessary, however, lest
retrieval become so cumbersome that all advantages of the
system are lost.
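
A small model shows why superimposed coding escapes the limit of direct coding and why restraint is nevertheless advisable. This Python sketch is illustrative only: the size of the coding field, the four-hole patterns, and the aspect names are all assumptions, and unwanted selections ("false drops") are offered here as one plausible reason for the cumbersomeness mentioned above.

```python
from math import comb

HOLES = 40            # assumed size of the coding field
HOLES_PER_ASPECT = 4  # assumed number of holes in each aspect's pattern

# Direct coding allows at most HOLES aspects; superimposed coding can
# draw on every combination of four holes out of forty.
patterns = comb(HOLES, HOLES_PER_ASPECT)
print(patterns)  # 91390

# Hypothetical codebook: each aspect is a pattern of hole positions.
codebook = {
    "alloys": {1, 5, 9, 13},
    "fatigue": {2, 5, 20, 31},
    "welding": {9, 13, 20, 31},
    "casting": {1, 2, 9, 31},
}


def encode(aspects):
    """Notch one card with the patterns of all its aspects, superimposed."""
    notched = set()
    for aspect in aspects:
        notched |= codebook[aspect]
    return notched


def drops(card_notches, aspect):
    """A card is selected when every hole of the aspect's pattern is notched."""
    return codebook[aspect] <= card_notches


card = encode(["alloys", "fatigue", "welding"])
print(drops(card, "alloys"))   # True: the aspect was coded on the card
print(drops(card, "casting"))  # True: a false drop, never coded on the card
```

The more aspects are superimposed on one card, the more holes are notched and the more often a pattern that was never coded is selected by accident.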

Aspect cards presuppose that the books or
documents are arranged by current numbers. This feature
limits the application of the method, at least in this
country, to a collection of documents, reports, reprints,
etc. Aspect cards could be used to control the subject
content of storage libraries, but it is doubtful whether
the infrequency of use, an assumed characteristic of
books in storage, would warrant the expense involved.
Another disadvantage, at least until Mr. Stern's
Microcite is fully developed, is the necessity to consult a
shelf list in order to complete the literature search. A
further criticism has been that only a small part of the
total card space is being used. Theoretically this
method has no limitations as to the size of the collection and
the number of aspects. A large file of aspect cards,
however, would make the continuous refiling
cumbersome, unless we add sorting of the file by marginal
holes. A large collection of documents would impel us
to have multiple cards for each aspect, to increase the
size of the card or to acquire expensive reading and
punching equipment. As information on the application
to larger holdings is lacking, we can only state that the
method has been most satisfactory for small and
middle-sized collections (under 1000 aspects; under 25,000
documents).

Hand-manipulated punched cards often have been
unfavorably compared with machine searching and
snobbishly called "the poor man's IBM." It is true that
these installations are far less expensive than
electronic equipment but that does not imply that they are a
cheap substitute. They have been very useful as initial
experimental steps to be converted later into a fully
mechanized operation (53), but this ancillary role does
not do full justice to the merits of the system. If
management of large masses of material were the sole
function of modern methods of bibliographical control,
then hand-manipulated punched cards would have to be
assigned to a second-rate status. Machine operations
have no limitations as to the size of the card file; the
sum total of access points on one card, however, is
restricted to the 960 bits on the IBM card. Electronic
equipment is very fast, but the breath-taking speed of
the machine becomes less impressive if we heed Shaw's
repeated warnings "not to confuse part of an operation
with the whole operation."

For the control of a limited number of literature 
references (up to 10,000) hand-manipulated punched
cards deserve a leading position. A similar claim can 
be made for the application to a "closed or one time 
project which would not warrant the experimentation in- 
volved in choosing the most suitable machine system 
but in which a nearly ideal setup can be devised in al- 
most no time using marginally punched cards" (54). 

The decided advantages of hand-manipulated punched
cards are at least fourfold:

1. Ample space for conventional recording of
information which can be read without transcription
and the possibility of inserting a microfilm.

2. Ease of operation. Everybody can quickly learn
how to handle the sorting needle.

3. Trifling capital investment and inexpensive
current costs.

4. Scant space requirement for the installation.

Aspect cards permit rapid selection and give under 
certain conditions the most economical operation. The 
arrangement by current numbers is the most inexpen- 
sive shelving system and if the collection is small, di- 
rect access to the literature can be given and the neces- 
sity of consulting a shelf list would be eliminated. 

No system can claim to be the answer to all quest- 
ions, or in all circumstances, and no prediction is 
warranted at present, that a given method will soon be 
obsolete and doomed (55). "A wise literature searcher 
will, therefore, utilize all facilities at his disposal 
from old fashioned catalog cards and conventional index- 
es to the speediest electronic computer. Every method 
has its place and is justified under the proper set of 
conditions" (56). 




The Gmelin Institute is an example of the efficient
and harmonious employment of all methods of
bibliographical control: card catalog, hand-manipulated
punched cards, aspect cards, and electronic equipment
contribute equally to the editing of its famous Handbuch
(57). In the vast bibliographical organization of the
Library of Congress almost all known methods of
information retrieval are employed (non-mechanical,
semi-automatic and fully mechanized); the activities of these
installations are coordinated by a Committee on
Mechanized Information Retrieval.



Problems for Future Research 

The literature on hand-manipulated punched cards
is very large and increases yearly at a rapid pace. A 
considerable number of the contributions are of a high 
scholarly level and have greatly advanced our under- 
standing. Nevertheless, many of the basic conceptions 
used are unclear, important parameters have not been 
substantiated by reliable data, and the sum total of 
knowledge available is still only a fraction of the infor- 
mation needed. The situation is typical not only of the 
science of documentation; it is characteristic of all 
fields of intellectual endeavor and inherent in our con- 
ception of scholarship. 

The first group of problems suggested for further 
investigation deals with basic research. Many of the 
questions in this category will remind librarians of the 
function of the catalog: Who needs information and what 
type of information? Furthermore, if it is correct, as 
it seems to be, that the patrons have to be classified 
in specific categories, what are the characteristics of 
each group, expressed in terms of information needed? 
Are our information systems geared to function in the 
exceptional cases or are we satisfied to meet the aver- 
age demand? It has been pointed out during the recent 
International Conference on Scientific Information, that 
eighty-nine per cent of the search questions involve 
three aspects or less. Is this judgment based on a 
valid statistical analysis of all categories of users? Are 
the information systems used and the terminology of
documentalists needlessly complicated and would we have
satisfactory results by employing less elaborate ma- 
chines and simpler language? 

The second group would treat comparisons between 
the different systems. What are the standards of com- 
parison? What is the definition of a "good" retrieval 
system? What are the exact data of speed and depth of 
a completed retrieval? What is the break-even point
between the different methods (58)? What are the char- 
acteristic qualities of a given method and for which sit- 
uation can it claim preference? 

The third group would be confined to the specific 
problems of the hand-manipulated punched cards. What
is the optimum size of the card file and the sum total
of aspects? How many access points can the single
punched card carry without making retrieval unduly
cumbersome? How great is the speed of retrieval tested
under varying circumstances? What is the optimum
space relation between coding area and text? The
advantages of the different coding systems in relation to
speed and depth of retrieval need better experimental 
substantiation. More information is needed about the 
efficiency of multiple rows of perforations. 

Little theoretical work has been done on slotted 
cards and on the combinations of slots and marginal 
holes. Aspect cards have been tested in small collect- 
ions but information is needed on their adaptability to 
large masses of literature. 



Conclusion 

It is imperative that librarians participate in
research on modern methods of bibliographical control.
Methods which have been worked out in a non-library 
situation cannot be adapted without serious disadvantages. 
"The problems of a library are, for the most part, 
unique to a library. They should be attacked only by 
persons who are willing to use them as being unique and 
to prescribe for them in uniquely suitable terms. There 
is no reason to think that machines or methods designed
to serve other purposes will be of much direct use to
the librarian" (59).

Careful recruitment of qualified personnel, adapta- 
tion of library school curricula, and increased opportun- 
ities to test the new methods in libraries are essential 
to stimulate research. Moreover, the profession as a 
whole must be a receptive audience for the studies and 
follow with interest all developments in this important 
segment of librarianship. 



Equipment 



United States 

Arizona Tool and Die Company (Boekeler Instrument 
Company). Tucson, Arizona. Trade name:
Needlesort

Edge punched cards with perforations around three 
edges are manufactured in two standard sizes: 
3 1/2" x 6" (48 holes) and 5" x 8" (68 holes). The 
larger card is available in four colors. The card 
can also be bought with perforations around the 
four edges (98 holes). The coding is direct and 
numerical sequence code. 

Burroughs Corporation -- Todd Company Division, form- 
erly Charles R. Hadley Company, Los Angeles, 
with many regional representatives. Trade name: 
UniSort 

Standard cards available are edge punched cards
with one row of holes, 4 holes to one inch, in
sizes 3" x 5" up to 6" x 8". For installations which
use cards in quantities of 50,000 and up, special cards
are almost the same price as the standard ones.
Used frequently for accounting procedures. A "uni- 
versal" library card has been designed by M. E. 
Putnam (University of Washington). 

The company was formerly connected with
the McBee Corporation and produced the "Rocket"
card. This card is no longer manufactured. 

Documentation Inc., Washington, D. C. 

The firm has developed the Uniterm or Coordinate 
Indexing System. All installations are tailored ac- 
cording to the specific needs of the given collection 
and are carefully supervised. The firm is interest- 
ed in all types of subject cards. 

E-Z Sort System, San Francisco, California (60)

Edge punched cards are available in a great variety
of sizes up to 8" x 10 1/2" with one to four rows
of small holes, 6 to one inch. Holes are staggered
along the margin and not in one straight line. All
four coding systems are applicable; however, E-Z
cards are at their very best in direct coding. The
multiple row arrangement permits the coding of the
greatest number of non-exclusive aspects per inch
of edge space of all systems. Also available is a
combination E-Z Sort and IBM card.

The system is widely used in research files.
The card with a combination of double and triple
rows at the Oak Ridge Laboratory mentioned above
(21) is E-Z Sort. Other important applications
are: The American Society of Metals, Special
Library Association Metallurgical Literature Card,
The Paint and Varnish Literature Card, The Illinois
E-Z Sort Anaesthesia Record Card, etc.

Frazier Precision Instrument Company, Silver Spring, 
Md. 

Card punch and reader for the Peek-a-Boo system 
of the National Bureau of Standards. 

Gaylord Bros. Inc., Syracuse 

Conventional 3"x5" library cards can be furnished 
with 6 holes punched at top and bottom edge. They 
can be used for direct coding and numerical
sequence code. The cards are applicable for
circulation control in small or middle-sized libraries.

International Business Machines, Inc., New York 
Peek-a-Boo cards with 480 perforations. 

Jonker Business Machines, Washington 15. Trade 
name: Matrex 

Peek-a-Boo cards with a capacity of 10,000 or
40,000 positions. A previous model with 15,000
positions is no longer recommended. The firm is
also available on a consulting basis for the
installation of all types of machine retrieval.

Le Febure Corporation, Cedar Rapids, Iowa. Trade
name: X-Ray Sort Card

Cards in varying sizes can be supplied up to 6 1/2"
x 10 3/4". Holes are punched in one row, four
holes to one inch. Coding is direct; in some
instances the sequence code 7421 has been
employed.

The system is geared for the control of busi- 
ness records. 

McBee Corporation, see Royal McBee 

Remington Rand, New York 19 

Peek-a-Boo card with 640 perforations. 

Royal McBee Corporation, Athens, Ohio, with many 
regional representatives. The Corporation was 
formed in 1954 with the merger of the Royal Type- 
writer Company and the McBee Company; McBee 
cards have been manufactured since 1933. Trade 
name: Keysort 
Marginal punched cards. 

Cards are sold in varying sizes from 2" x 
3 1/2" to 8" x 10 1/2"; larger cards can also be
supplied. They are preperforated around the edges
with a single or double row of holes. The holes 
are spaced either on 1/4" centers, or on 2/10" 
centers, which gives four or five holes respectively 
to one inch. Recently interior punching has been 
added and automatic data processing can be achieved 
with the Keysort Tabulating Punch. The largest 
manufacturer of marginal punched cards in the 
United States. McBee cards are so widely used in 
industries and colleges (both administration and li- 
brary) that the name has become almost synony- 
mous with edge punched cards. 

Superior Business Machines, Inc., New York 17 (61) 
Trade name: Flexisort 

Does not use preperforated cards. All cards can
be used; even existing records can be converted
into a marginal punched card file. The Flexisort
machine punches the holes and codes by notching in
one simultaneous operation. 32 holes can be punched on
any one side (total: 128 holes); five perforations
measure one inch. Coding is direct; the
alphabetical and numerical sequence code 7421 can also be
employed.

The machine is available on an annual rental 
basis. 

Underwood Corporation, Samas Division, New York 

Peek-a-Boo card with 210 and 400 perforations re- 
spectively. 

William K. Walthers, Milwaukee (62) Trade name: 
Findex System 

Slotted cards. Two cards are available in an 
assortment of colors: 6" x 8" and 8" x 8". Special 
cards are designed for every Findex installation. 
(Not manufactured at present.) 

Wassell Organization, Inc., Westport, Conn. 

Produces a vinyl plastic card 5" x 8" (trade name: 
Plas-Ta Card) used in the Peek-a-Boo system at 
the National Bureau of Standards. 

Zator Company, Cambridge 38, Mass. (63). Trade 
name: Zatocoding System 

The system employs preperforated cards with forty 
or seventy-two holes respectively. Both cards 
measure 5" x 8"; one card carries holes on the top 
margin, the other on both top and bottom margins. 

The method has many interesting features, but 
two must be specially emphasized: Superimposed 
coding based on the random selection of four holes 
per code position; the use of "retrieval" language 
for the descriptors instead of "communicative" 
language. Mr. Mooers, the inventor of the system, 
does not believe that conventional library classifica- 
tion and subject headings are compatible with suc- 
cessful retrieval. 
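
The superimposed coding idea can be sketched in a few lines of Python. The sketch is illustrative only and is not Mooers's specification: it assumes a forty-hole field, gives each descriptor a random pattern of four holes, and notches a card with the union of its descriptors' patterns. The descriptor names, the seed, and all function names are hypothetical.

```python
import random

HOLES = 40                 # assumed size of the coding field
HOLES_PER_DESCRIPTOR = 4   # four holes chosen at random per descriptor


def make_codebook(descriptors, seed=0):
    """Assign each descriptor a random pattern of four hole positions."""
    rng = random.Random(seed)
    return {
        d: frozenset(rng.sample(range(HOLES), HOLES_PER_DESCRIPTOR))
        for d in descriptors
    }


def notch_card(descriptors, codebook):
    """Superimpose (union) the patterns of all descriptors on one card."""
    pattern = set()
    for d in descriptors:
        pattern |= codebook[d]
    return pattern


def select(cards, query, codebook):
    """Return the cards whose notches include every hole of the query pattern."""
    wanted = notch_card(query, codebook)
    return [name for name, pattern in cards.items() if wanted <= pattern]
```

Under these assumptions a card bearing the queried descriptors is always selected. Because patterns overlap in a shared field, an unrelated card may occasionally be selected as well, which is the price paid for packing many descriptors into few holes.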

Selection is made with the Zator Selector. 

All installations are tailor-made and carefully 
supervised by the company. 

The equipment is provided on a rental basis. 

England 

Brisch and Partners, Ltd., London and Toledo (64) 
A consulting firm for all types of information 
retrieval and classification. 

Carter-Parratt, Ltd., London SW1. Trade name: 
Brisch-Vistem 
Peek-a-Boo card, 6" x 11", capacity: 1000 positions. 

Copeland-Chatterson Company, Ltd., London (65) 
Trade name: Paramount Punched Card System 
Cards are available in a great variety of sizes, 
mostly with one row, some with a double row of 
prepunched holes. Two sizes of holes, 4 or 5 to 
one inch. Coding is direct or combination code. 
For numerical sequence sorting three code fields 
can be supplied: a ten-position field, 0-9; a 
six-position field, 0-5; and the conventional 7421 
field. For alphabetical sequence sorting two 
alphabetical fields are available with 12 or 15 
positions respectively. 

The largest English manufacturer of marginal 
punched cards and the pioneer in this field. 

France (66) 

Compagnie des Fichiers Modernes, Paris 12. Trade 
name: Rapidtri 

Marginal punched card; six sizes are available from 
3" x 5" to 7 3/4" x 10 1/2". They have one or 
two rows of preperforated holes, five holes to one 
inch. 

Direct coding and combination coding are 
applicable; besides the four-position numerical 
sequence code, there are two alphabetical codes with 
18 and 4 positions. 

Dequeker S. A., Paris 

The fabrication of slotted cards and of the Selecteur 
has been abandoned for the time being. 

Societe Detectri, Paris 2. Trade name: Sphinxo 

Peek-a-Boo card, 5 1/2" x 10 1/2", capacity: 1000 
positions. 

Trade name: Statitex 

Marginal punched card; a variety of cards is 
available from 3" x 5" to 8 1/2" x 10 1/2" with 
preperforated holes ranging from 48 to 134, mostly in 
one row but some with a double row of perforations. 

Trade name: Detectri 

Slotted card, available in three sizes from 6 1/2" x 
9 1/2" to 8 1/2" x 10 1/2". The coding field is on 
the right part of the card; each coding position 
consists of a horizontal slot. 

Societe Microdoc, Paris 13. Trade name: Selecto 

Peek-a-Boo cards; great variety of sizes and capac- 
ities. 

2000 punching positions 
5000 punching positions, 3 1/4" x 7 1/2", zigzag pattern 
8000 positions, 8 1/4" x 6" 
12,500 positions, 8 1/4" x 6", on plastic 
14,000 positions, 8 1/4" x 6" 
20,000 positions, 8 1/4" x 6", on plastic (in preparation) 

Societe Selection, Vanves, Seine. Trade name: Selectri 
Slotted card. Standard size: 3 1/2" x 7 1/2", var- 
iations are available, however not exceeding 7 1/2" 
x 10 1/2". 

The coding position consists of a horizontal slot. 



Germany (67) 

Aliform, Berlin W 15 

Slotted card. Offered in one size with four capac- 
ities, 72, 96, 204, 306 perforations. 

The upper part of the card is reserved for con- 
ventional recording, the lower part is the coding 
field. Coding position is a vertical slot. 

Peek-a-Boo card; 8 1/4" x 6" with 1860, 2000, 
6000 positions respectively. 

Edler & Krische, Hannover. Trade name: Ekaha 

Peek-a-Boo card; 8" x 12" with 7000 punching 
positions. 

Marginal punched card, available in varying 
sizes with one or two rows of preperforated holes 
all around the edges. In addition to all conventional 
coding methods an "additive" code is offered. This 
method is suggested when selection of a given num- 
ber is more important than sequence sorting. The 
coding section consists of three coding fields with 
eight positions arranged in two rows each. The 
positions are marked 1247 and the fields are 
designated as in a sequence code with: units, tens, 
and hundreds. 
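
A short Python sketch may make the 1-2-4-7 marking concrete. It rests on assumptions not stated in the text: that a digit is represented by notching the one or two marked positions that sum to it, and that zero is represented by no notch at all. The function names are hypothetical, and the two-row arrangement of the field is not modeled.

```python
WEIGHTS = (7, 4, 2, 1)  # the four marked positions of one code field


def notches_for_digit(digit):
    """Return the marked positions notched for one decimal digit.

    Assumption: zero means no notch; every other digit needs one
    or two notches whose marks add up to the digit.
    """
    if not 0 <= digit <= 9:
        raise ValueError("digit must be between 0 and 9")
    if digit == 0:
        return ()
    for w in WEIGHTS:                      # single notch: 7, 4, 2, 1
        if w == digit:
            return (w,)
    for i, a in enumerate(WEIGHTS):        # pair of notches: 3, 5, 6, 8, 9
        for b in WEIGHTS[i + 1:]:
            if a + b == digit:
                return (a, b)
    raise AssertionError("unreachable for digits 0-9")


def encode_number(n, fields=("hundreds", "tens", "units")):
    """Encode a number field by field, e.g. 361 -> hundreds 3, tens 6, units 1."""
    digits = f"{n:0{len(fields)}d}"
    if n < 0 or len(digits) != len(fields):
        raise ValueError("number does not fit the available fields")
    return {field: notches_for_digit(int(d)) for field, d in zip(fields, digits)}


print(encode_number(361))
```

With three fields of this kind any number from 0 to 999 can be coded with at most six notches.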

Slotted cards. Size 8 1/4" x 6"; upper part 
for conventional record, lower part for coding, 210 
perforations arranged in eight rows. The coding 
position is a vertical slot. 

Offered also is a combination edge punched and 
slotted card. The slotted card is the same as de- 
scribed before, but has on the upper edge two rows 
of holes. Another combination has a single row of 
holes on the top and halfway down both sides. 

Integral, Duesseldorf 

Marginal punched card, 8" x 3" with one row of 
holes around the edges, mainly for bookkeeping. 

VEB Organizationsmittel-Verlag, Leipzig 
Marginal punched cards. 

Available in four sizes ranging from 4 1/4" x 2 3/4" 
to 11 1/2" x 8" with one or two rows of 
perforations around the edges. Four holes to one inch. 

Slotted cards same formats as above. 

Coding field on the lower part of the card, 340 
perforations arranged in 10 rows. 

Lochkartenwerk Schlitz, Schlitz, Hessen 
Marginal punched cards. 

17 different varieties available with one or two 
rows of holes around the edges, four perforations 
to one inch. Seven colors are offered. 

Two combinations of Peek-a-Boo and edge 
punched card: 

a. 48 perforations on two sides and 800 positions 
b. 52 perforations on two sides and 2000 positions 



Italy (68) 

Samo, Milan. Trade name: Selez 
Slotted card. 

Different sizes from 4" x 2" to 6 3/4" x 5 3/4" 
with one or two rows of slotting positions on the 
upper and lower edge of the card. For bookkeep- 
ing purposes only. 



Japan (69) 

Gaikoku Bunken-Sha, Tokyo 
Marginal punched cards. 

Bunshodo, Tokyo 

Marginal punched cards. 



Netherlands (70) 

Semper Avanti, The Hague. Trade name: Delta Card 
Peek-a-Boo card. 

The card is 12 1/2" x 9" and has a capacity of 
10,000 positions arranged in a zigzag pattern. 



Poland (71) 

As far as I could ascertain American type punched 
cards are used. 



Sweden 

Esselte, Stockholm. Trade name: Sorto 
Marginal punched card. 
Different sizes with one row (in a few cases with 
a double row) of preperforated holes around the 
edges. All coding systems are applicable; selection is 
by needle. Mostly for business purposes. 



U.S.S.R. (72) 

As far as I could ascertain Russian documentalists are 
mainly interested in high speed electronic machines. 

Table 1 

Summary of Some Typical Industrial and Governmental 
Mechanical Information Processing 
and Retrieval Activities 

from 

Kent, Allen & James W. Perry - Centralized Infor- 
mation Services. Cleveland, Press of Western 
Reserve University, 1958. Table 14. 



Large Industrial Organization 
Large Industrial Organization 
Medium Sized Industrial Organization 
Medium Size University 
Research Institute 
Dept. of Chemistry, University 
Large Industrial Organization 
Dept. of Biochemistry, University 
Large Industrial Organization 
Large Industrial Organization 
Large Industrial Organization 
Government Agency 
Government Agency 
Large Industrial Organization 
Large Industrial Organization 
Large Industrial Organization 
Municipal Police Dept. 
Large Industrial Organization 
Government Agency 
Large Industrial Organization 
Large Industrial Organization 
Large Government Project, University 



IBM 
IBM 
E-Z Sort Cards 
IBM 
IBM 
Keysort Cards 
Keysort Cards 
Keysort Cards 
IBM 
Keysort Cards 
IBM 
Peek-a-Boo 
Keysort Cards 
IBM 
Keysort 
IBM 
Remington Rand 
IBM 
IBM 
IBM 
IBM 
IBM and Keysort Cards 



Well Satisfied 
Well Satisfied 
Indeterminate 
Well Satisfied 
Well Satisfied 
Well Satisfied 
Well Satisfied 
Well Satisfied 
Indeterminate 

Indeterminate 
Indeterminate 
Well Satisfied 
Partly Satis. 
Well Satisfied 
Well Satisfied 
Appear Well Satisfied 
Indeterminate 
Well Satisfied 
Well Satisfied with regular file Uniterm. Dissatisfied with experimental Keysort files 



46 State of the Library Art 

Table 2 

117 Retrieval Systems 
Tabulated From 

Kent, Allen, Nonconventional Retrieval Systems 
in Documentation. Cleveland, School of Library 
Science, Western Reserve University, 1958 
(Air Force Office of Scientific Research, Technical 
Note 3) 

System                  Number of Installations 

IBM or similar          57 
IBM with Peek-a-Boo      1 
Peek-a-Boo               3 
Dequeker                 4 
Zator                    1 
E-Z                      5 
Uniterm                 11 
Uniterm M                1 
McBee or similar        34 

Table 3 

24 Retrieval Systems 
Tabulated From 

National Science Foundation, Non Conventional 
Technical Information Systems in Current Use. 
Washington, National Science Foundation, 1958 

System                  Number of Installations 

Zator                    1 
Uniterm                  5 
Uniterm M                4 
Peek-a-Boo               1 
McBee                    2 
IBM                     11 

Volume Four Part Two 

FEATURE CARDS 
(Peek-a-Boo Cards) 

by 
Lawrence S. Thompson 



1. Definition 

1. Some American documentalists have used the 
rather ludicrous terms "peek-a-boo" (61), "peekable," 
or "peephole" for the type of punched card to be 
discussed in this essay. Other English-speaking 
documentalists have played with such other terms as 
"super-imposable," "optical," or "coincidentally" 
punched cards. Foreign terminology is less uncertain: 
"fiches superposables" (Fr.), "sichtlochkarten" (Ger.), 
"titthalkort" (Sw.), or "onderwerpponskaarten" (Du.). 
They are also called Cordonnier or Batten cards after 
two of their leading exponents. The term "feature 
card" is advocated by J. L. Jolley (34) and J. Edwin 
Holmstrom (correspondence with the writer). In 
contrast to other punched card systems, which use one 
card for one document or one item, the feature card 
system uses one card for one subject, characteristic, 
aspect, or feature. Hence the more exact (and more 
dignified) name will be used here. 



2. Horace Taylor's Patent 

2. The earliest record of the use of feature cards 
is 1915, when Horace Taylor of Brookline, Mass., 
patented a "selective device" for the identification of 
birds (56). He used a foundation sheet bearing the 
names of birds in combination with "screen sheets" 
showing such characteristics as having a topknot, 
perching on a branch, medium in size, and blue or 
blue-gray. Each screen sheet is perforated over a 
position on the base sheet if the bird named at that 
position has the characteristic. When the four screen 
sheets thus perforated are placed over the base sheet, 
only the name of the blue jay may be read, since it 
is the only bird combining all of these 
characteristics. Although Taylor used a base sheet 
rather than coding his perforations, he hit upon the 
basic principle of feature cards, which has not been 
changed. Four decades later there was a return to the 
notion of a base sheet in greatly refined fashion, 
Microcite (47), infra. 
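
Taylor's principle amounts to intersecting sets of positions, as the following minimal Python sketch suggests. Apart from the blue jay and the four characteristics named in the patent description, the bird list and the perforation data are invented for illustration, and the function name is hypothetical.

```python
# Foundation sheet: position number -> bird name (illustrative list).
FOUNDATION = ["blue jay", "cardinal", "robin", "bluebird", "crow"]

# Each screen sheet is the set of positions perforated for one characteristic.
SCREEN_SHEETS = {
    "has topknot":         {0, 1},
    "perches on a branch": {0, 1, 2, 3, 4},
    "medium in size":      {0, 1, 2},
    "blue or blue-gray":   {0, 3},
}


def superpose(*sheets):
    """Positions still open when all sheets are stacked: the intersection."""
    open_positions = set(sheets[0])
    for sheet in sheets[1:]:
        open_positions &= sheet
    return open_positions


visible = superpose(*SCREEN_SHEETS.values())
print([FOUNDATION[p] for p in sorted(visible)])  # prints ['blue jay']
```

Each added sheet can only close positions, never open them, so the pile narrows the choice with every characteristic observed.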

While the illustration on Taylor's original patent 
showed base sheets of only thirty-five birds, he stated 
that the perforations on each of the screen sheets were 
"so arranged" that the screen sheet could be superposed 
on the foundation sheet in four positions, that is, rever- 
sible top for bottom, or recto for verso (with both top 
and bottom aspects on the verso). This arrangement 
increases the potential of Taylor's "selective device" 
fourfold. Taylor does not describe any equipment for 
making his rectangular perforations. 

Taylor does not describe the arrangement of his 
screen sheets. While classification or other arrange- 
ment of the feature cards is, obviously, an essential to 
their effective manipulation, (8; cf. also 4), the theoret- 
ical aspects of the problems of classification belong in 
another essay and will be treated here only as they may 
affect mechanical aspects of searching. 



3. H. E. Soper's Patent 

3. In 1920 Herbert Edward Soper of London, 
England, patented another feature card system (52). He 
does not suggest any specific use in his patent other 
than for "tabular and statistical data" in general. He 
does not use a base sheet. His card, as illustrated in 
the patent, shows only 130 positions but there is no im- 
plication that the size of the card is limited to this 
number. He uses simple circular perforations made by 
an awl-like perforator. He suggests a luminous screen 
at the back of the superposed cards, a readout device 
still widely used. He says categorically that his device 
"possesses great advantages. . . over systems of indexing 
employing cross references or multiple entry." He 
emphasizes the savings in cost and material. Soper's 
invention is the classic statement of the feature card in 
generalized form. It is curious to note that neither 
Soper nor Taylor seems to have had any subsequent 
influence, and they have been noted only historically in 
the accelerated interest in feature cards that has 
developed in the post-World War II era (e.g., 31, 60). 



4. The Borgeaud-Liber Patent 

4. A third unrelated patent, taken out in France 
by Borgeaud et cie. and Henry Liber (6) in 1924, did 
bear fruit in the work of Cordonnier some two decades 
later (3, 30), infra. The Borgeaud-Liber patent is 
quite similar to Soper's, and it uses examples of 
identification of individuals. Although Soper did not 
imply that his cards were positively limited to 130 
positions, it is significant to note that the 
Borgeaud-Liber cards show 1,000 positions. This patent 
shows a new refinement in the use of different colored 
cards for chronological identification. It cites the 
example of a record of orders by a business firm in 
which a given individual may be identified as a client, 
while the position to which he is assigned is backed up 
not by a transparent perforation, but by a color 
showing the penultimate month to the month of his last 
order. Holmstrom (22) expressed this use of color in 
more general terms. If we need to ascertain documents 
which satisfy criteria A, B, and C, but not Z, we need 
simply to put Z at the bottom of the superimposed pile 
and cover it with a transparent colored plastic (cf. 
also 7). Another use of color, of course, is to 
differentiate various series of references after the 
maximum number of positions on one series of cards is 
used (4, 22, 27, 43). 
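
In logical terms the search for "A, B, and C, but not Z" is an intersection followed by a set difference. The Python sketch below illustrates only that logic, not the optical arrangement of pile and colored plastic; the position numbers and the function name are hypothetical.

```python
def select(required, excluded=()):
    """Positions punched on every required card and on no excluded card."""
    hits = set.intersection(*map(set, required))
    for card in excluded:
        hits -= set(card)
    return hits


# Illustrative punchings: each card lists the document positions it carries.
A = {1, 2, 3, 4}
B = {2, 3, 4, 5}
C = {2, 3, 4, 9}
Z = {3}

print(sorted(select([A, B, C], excluded=[Z])))  # prints [2, 4]
```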



5. Feature Cards for Mineral Identification 

5. In August 1920 C. J. Gray, a geologist in the 
service of the Zululand and Natal Mines Departments, 
read a paper to the Geological Society of South Africa 
on the use of physical characteristics of minerals for 
their identification (25). There is no evidence that he 
was acquainted with the work of Taylor and Soper, and 
for the next quarter of a century mineralogists 
operated independently, insofar as their publications 
(16, 17, 18, 21, 28) indicate, of the ideas of the three 
early patentees. The problem faced by mineralogists is 
that some rare mineral, or some deviant or unusual form 
of a common mineral, may resemble a sample. This 
possibility may be ignored in confirmatory tests. Gray's 
experience with tables of characteristics for 
identification of minerals had been unsatisfactory, 
since they do not invariably allow the scheme of 
examination to be adapted to a particular specimen. 

Gray set up a list of 361 minerals and presumably 
had only that number of spaces for perforation, or per- 
haps a few more. He selected 66 characteristics, such 
as luster, color, cohesion, hardness, effect of heat, 
specific gravity, and crystallization, and subdivisions 
under each. Like Taylor, he used a base sheet. He 
was cautious in his claims for his system: "No claim 
is made that every mineral can be determined exactly 
by use of the set of sheets alone, but such use will 
rapidly so reduce possibilities that almost invariably the 
exact identification will be clear to a man with a good 
knowledge of minerals. . ." 

Gray had no more direct influence than Taylor or 
Soper; and when J.D.H. Donnay, a Belgian mineralogist, 
first explained his system sixteen years later (17), he 
seems to have been ignorant of Gray's work. Donnay, 
who worked out his system with J. Melon of Liege, 
operated with cards with 330 positions (representing 
some 360 species) in a fashion almost identical with 
Gray's. He is more positive than Gray about the 
accuracy of his system. He argues that if two or three 
positions are open at the end of the operation, the 
selection of supplementary properties will eliminate all 
but one. In a later article for American mineralogists 
(16) he was more conservative and warned that it was 
hard to say that a mineral has no cleavage from 
examining one specimen only. 

Donnay made his cards from Manila cardboard and 
filed them in a wooden box with an obliquely cut lid. 
Donnay and Melon also made a larger set, using 210 
cards for characteristics, but it seems to have been too 
expensive to produce in quantity (18). 

Hurlbut (28) developed a system of mineral 
identification on edge-notched cards, with each card 
representing a mineral rather than a property of a 
mineral. He said that it was easier to separate by 
properties according to his system than to thumb 
through the list of properties represented by each of 
Donnay's feature cards. He also pointed out that his 
cards could be thrown together in any order after use. 
Wachtel (57, 58), infra, rejected edge-notched cards 
for identifying properties of nuclides, since the total 
bulk of the cards would be twice as great if she had to 
provide one card for each document. The same is true 
of Hurlbut's system, except that he would have 
approximately six times as many cards in his pack as 
Donnay had in his, if there must be a card for each of 
361 minerals. 

Fairbanks (21) developed a somewhat more elaborate 
system than Donnay's for the identification of 
non-opaque minerals, especially the fine intermixed and 
disseminated forms of ores. He used standard tabulating 
machine type cards (Powers 1060, 7 3/8" x 3 1/4", pur- 
chased from Remington-Rand), but he assigned only 356 
of the 540 positions to known mineral species. There 
are 117 cards in a set. Like Donnay, Fairbanks ar- 
gues for the superiority of the feature cards over con- 
ventional tables. 



6. W. E. Batten's Feature Cards 

6. The work of the mineralogists with feature 
cards attracted little, if any, attention outside of their 
own field. The present enthusiasm for feature cards 
may be attributed mainly to the work of W. E. Batten 
(3, 4), the head of the Intelligence Department of the 
Plastics Division of Imperial Chemical Industries, 
London, and G. Cordonnier, a former professor of 
mathematics and chief engineer for the Génie Maritime, 
Paris (8). Resumes of the work of Batten and 
Cordonnier, with notes on the later development of their 
work, may be found in Holmstrom's articles in the 
F.I.D. Manual on Document Reproduction (22), in his 
articles in the UNESCO Monthly Bulletin on Scientific 
Document Reproduction and Terminology (mainly 54), 
and in the second part of his work on Facts, Files, and 
Action in Business and Public Affairs (27). 

Batten came face to face with the problem of in- 
dexing patent literature pertinent to his field in 1939 
when the British Patent Office ceased publishing its 
Official Abridgements. Due to the relative inflexibility 
and expense of high-speed mechanical devices for se- 
quential handling of large collections of data, Batten had 
to turn to some other device to ascertain patents which 
dealt with specific topics. He first developed feature 
cards providing spaces for 400 documents, numbered 1 
to 400. When this series was complete, he used a 
second series with positions numbered 401 to 800, and 
so on until he could provide for 4,000 items, the tenth 
card in the series providing for positions 3,601 to 4,000. 
By this time the basic limitation of the primitive feature 
card, namely, the number of documents that could be 
searched in one operation, became apparent. Batten 
then substituted for his eleventh and subsequent series 
of documents a standard Hollerith (IBM) card with 800 
positions. (Note that these cards could not be used for 
machine selection with Batten's notation; they were 
only a handy, readily available medium for a feature 
card system.) 
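
The arithmetic of Batten's first arrangement (400 positions per card, ten series for 4,000 items) can be set out in a few lines of Python. This is only a sketch of the numbering described above; the function names are hypothetical.

```python
POSITIONS_PER_CARD = 400  # Batten's first feature cards


def series_for(doc_number):
    """Series (1, 2, 3, ...) whose cards carry the given document number."""
    if doc_number < 1:
        raise ValueError("document numbers start at 1")
    return (doc_number - 1) // POSITIONS_PER_CARD + 1


def position_range(series):
    """First and last document number provided for by one series."""
    first = (series - 1) * POSITIONS_PER_CARD + 1
    return first, first + POSITIONS_PER_CARD - 1


print(series_for(401), position_range(10))  # prints 2 (3601, 4000)
```

A search that must cover all 4,000 items therefore has to be repeated ten times, once per series, which is the limitation noted in the text.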

Batten stated categorically that "No amount of 
mechanical aid will make up for a defective classifica- 
tion system. " He made a genealogical breakdown of the 
subject matter into classes of aspects and subclasses. 
Then he used a suitable decimal classification so that 
each major or minor aspect class was coded with a se- 
quence of digits. Batten's rather classical treatment of 
his subject matter illustrates a specific problem that 
every user of feature cards must face and solve. 

Holmstrom (22) suggests various forms of headings 
for a subject classification of feature cards. One 
suggestive point is that the "facets" in Ranganathan's 
Colon Classification might lend themselves to this 
treatment. 

Batten and others recognized serious limitations to 
his system. Batten pointed out that searching is slower 
than in a fully mechanized system, and he stated that 
his system is "essentially for the small-scale 
operator" (3). He said that it saves technical 
man-hours but not clerical man-hours. A graver 
limitation is that only so many documents can be 
recorded, a more aggravated problem with him than with 
Cordonnier and others, since we will note cards 
providing for up to 40,000 documents. Feature cards 
have been contrasted with another widely used manual 
punched card system, edge-notched cards (27, 54; cf. 
also paragraph 5, supra). While the number of documents 
that can be indexed by the latter is theoretically 
unlimited, the number of subjects is limited by the 
number of notches that can be placed on the edges. In 
the case of the former, the number of subjects to be 
indexed is theoretically unlimited (Batten used about 
1,000 cards per series [22, 27]), but the number of 
documents that can be indexed in a single series is 
limited to the number of punchable positions on the 
feature card. 
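
The contrast between the two systems is that of a file and its inversion. In the Python sketch below, offered only as an illustration, an edge-notched file is modeled as one entry per document listing its subjects, and the feature card file as one entry per subject listing its documents; the sample data and the function name are hypothetical.

```python
from collections import defaultdict


def invert(item_cards):
    """Turn item cards (document -> subjects) into feature cards (subject -> documents)."""
    feature_cards = defaultdict(set)
    for doc, subjects in item_cards.items():
        for subject in subjects:
            feature_cards[subject].add(doc)
    return dict(feature_cards)


# Illustrative edge-notched file: one card per document.
item_cards = {
    1: {"plastics", "patents"},
    2: {"plastics"},
    3: {"patents", "films"},
}

feature_cards = invert(item_cards)
print(len(item_cards), "item cards;", len(feature_cards), "feature cards")
```

The item file grows by one card with every document, and its notches bound the subjects. The feature file grows by one card with every subject, and its punchable positions bound the documents.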

Batten (3) did not think his system could be used 
with a large number of references. Nevertheless, he 
had some vision of wider application, suggesting that a 
different mechanical form of the same principle would 
be the solution. For example, it might be possible to 
replace feature cards with feature strips, "the latter 
consisting of perforated ribbons wound on spools and 
adapted to be run through a scanning device whilst 
superimposed." Or the signalling medium need not be a 
perforation on a ribbon, but a mark on a film or an 
impulse on a sound track. 



7. G. Cordonnier and 
Developments in France 

7. G. Cordonnier gives credit to the 
Borgeaud-Liber patent in his fundamental study of 
feature cards (8), stating that Sphinxo cards produced 
by Detectri, 68 rue de Richelieu, Paris (53), 
incorporated the possibilities in the patent after it 
lapsed into the public domain. While Taylor, Batten, 
and the mineralogists were all attacking a specific 
problem of selection, the Borgeaud-Liber patent (like 
the Soper patent) is couched in more generalized terms, 
and Cordonnier's uses for feature cards are broader 
than any previous practical application. Cordonnier 
first developed the Sphinxo card (8, 53) with 1,000 
positions. Although Cordonnier (8) expressed 
dissatisfaction with the limited number of positions, 
this card still has practical uses in instances in 
which the cases are usually limited to fewer than 1,000 
(e.g., medical records, industrial accidents, and even 
limited bibliographies; see 53). These cards are 
available in seven colors, and a mechanical perforator 
and reading frame (for illuminating the "through 
holes") are available. Detectri's promotional brochure 
(53) is especially persuasive in its description of 
non-documentary applications of Sphinxo cards. 

For most documentary purposes 1,000 positions 
are insufficient, and therefore Cordonnier developed the 
Selecto card (produced by Société Microdoc, 9 rue 
Rubens, Paris 13e) with 2,000 positions and a 
perforator to go with it (8, 49, 50). However, it was 
necessary to cover much larger collections of documents, 
and the next step was to expand the Selecto cards to 
include 12,500 positions. Moreover, this new type of 
card allowed the subsequent introduction of new entries, 
for positions could be left to provide for this 
contingency (22). Cordonnier worked out a card 15 x 21 
cm. (5 1/8" x 8 1/4") printed like graph paper (with 
positions on x from 0 to 99, on y from 0 to 124). They 
were printed in eight different colors so that a total 
range of 100,000 documents could be covered in eight 
series (27). A special perforator, Per Selecto, and a 
reading frame, Sta Selecto, are available from 
Microdoc. The small size of the positions for 
perforations was a source of some concern, since it was 
feared that shrinkage and expansion might vitiate the 
fine dimensions of the 12,500-position card. Cordonnier 
first experimented with plastic sheets, but later he 
found a special cellulose material which is reported to 
be satisfactory (27, 50). 
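
The capacity figures follow from the grid: 100 columns by 125 rows give 12,500 positions, and eight colors give 100,000. A minimal Python sketch of locating a document on such a card is given below. The row-by-row numbering, the zero-based document numbers, and the function name are assumptions made for illustration; the sources cited do not state how positions were assigned.

```python
COLUMNS = 100   # x runs from 0 to 99
ROWS = 125      # y runs from 0 to 124
PER_CARD = COLUMNS * ROWS      # 12,500 positions on one card
COLORS = 8                     # eight color series


def locate(doc_number):
    """Map a running document number (from 0) to (series, x, y).

    Assumption: documents fill a card row by row, then pass to the
    next color series.
    """
    if not 0 <= doc_number < PER_CARD * COLORS:
        raise ValueError("document number outside the eight series")
    series, offset = divmod(doc_number, PER_CARD)
    y, x = divmod(offset, COLUMNS)
    return series, x, y


print(PER_CARD, PER_CARD * COLORS)  # prints 12500 100000
print(locate(12500))                # prints (1, 0, 0)
```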

Holmstrom (27) reports six different applications 
of Cordonnier's system in Paris. Cordonnier himself 
used it at the Naval Ministry as the basis for a 
technical service. The Mineralogical Laboratory at the 
Sorbonne used Cordonnier's system for crystallographic 
identification; and Holmstrom emphasizes the economies 
in space when feature cards are used to index 
references to organic chemical compounds in general (one 
card for each kind of atom in the molecular diagram 
instead of thousands for various substances). The 
judicial identification system has put over 1,500,000 
fingerprints and other criminal data on feature cards, 
and searches are now said to be much more rapid than 
formerly. The Centre National des Recherches 
Scientifiques indexes and searches its files of 
unpublished translations with feature cards. The Union 
Française des Organismes de Documentation uses feature 
cards as a means of ascertaining which of several 
hundred libraries and documentation centers in France 
offer such services as microfilming, publishing 
abstracts, translating, etc. 

The most striking use of Cordonnier's system is 
by the Institut des Fruits et Agrumes Coloniaux, 6 rue 
de General Clergerie, Paris 16e (9, 22, 27, 29, 30). 
Originally the Institute used the 2,000-position cards, 
but it soon changed to the 12,500-position cards. The 
Institute indexes 350 current periodicals, using a special 
type of subject classification developed by Cordonnier 
(8, 39). At the end of each year, or when all positions 
on an aspect card have been used, the Paris office 
punches out several sets. Thus the colonial branches 
are able to conduct their own literature searches and 
request microfilm of needed articles from the Paris 
office. Punching several sets is said to be quick and 
cheap. Holmstrom (22) says it saves the expense and 
delay of compiling and printing indexes to abstract bul- 
letins. Further, he says, it might make possible the 
consolidation of references in several different abstract- 
ing organs or bibliographies. 

Another biological application of feature cards was 
developed by Holmstrom for the Fisheries Biology 
Branch of the Food and Agriculture Organization (26). 
The original report with details of the recommendation 
was not available. 

Holmstrom (22) records certain apparent 
objections to Cordonnier's system. While the number of 
documents that can be searched in a file is limited, 
this is compensated by the extreme quickness of the 
search and the fact that in a large organization, 
different sets can be searched simultaneously at 
different desks. Lining up pinpoint perforations (0.7 mm 
in diameter and 1.4 mm apart) is said to be 
time-consuming and possibly to lead to error; but the 
manufacture of the cards (Cordonnier's are cut to 0.001 
mm precision under controlled temperature and 
humidity), the accuracy of punching devices and readout 
devices, and, above all, practical working experience 
over several years, indicate that the system is quick 
and accurate (8, 50). 
Holmstrom thinks there is no difficulty involved in de- 
veloping a systematic filing system, since cards can be 
filed in an arbitrarily serially numbered order with fil- 
ing numbers written in on the margins of an outline of 
the classification, or any desired classified order; or 
verbal descriptive headings may be used. They can al- 
so be edge-notched for purposes of arrangement, a sug- 
gestion also made by Kistermann and Uhlein (37). There 
is the objection that it is necessary to go to another file 
to find even the title, but Holmstrom proposes that a 
file of microfacsimiles of indexed articles be kept within 
reach. Microcite (47), infra, is another possible 
answer. It is said that it is hard to search completely 
for any general subject, but Holmstrom thinks that the 
converse is true, since each document can be punched 
on the feature card or cards representing its most 
specific subject, or subjects. 

One grave obstacle is the high labor cost for in- 
dexing a large number of articles and the necessity of 
using meticulous workers (22). Two persons working in 
tandem cannot punch more than 1,500 holes in a 
working day. For as many as 3,000 punchings two teams 
of two workers must be used, one working with odd- 
numbered feature cards, the other with even-numbered 
ones. 

It should be noted that Microdoc also produces 
8,000 and 14,000-position cards, 10,000 and 
20,000-position plastic sheets, and 5,000-position 
cards in standard IBM format which can be filed in IBM 
equipment (54). The latter has a space-saving rhomboid 
pattern of positions similar to the Delta cards produced 
by Semper Avanti (10, 43, 44), infra. Microdoc 
furnishes books with thumb indexing for filing the 
cards. There is a hand punch for the 8,000-position 
cards, but for the others there is a specially designed 
punch with a screw adjustment in two dimensions to 
enable quick and accurate location of very small holes 
(54). 

No mechanical problem has been reported in the 
use of cards with very large capacity when a precision 
punch is used; and, clearly, they increase the number 
of items processed by a single operation. However, 
they do not allow the identification of required items by 
negative characteristics (22, 7, and supra, paragraph 
4), since reflected color will not show through them. 



8. The Netherlands 

8. The feature card has been used extensively in 
the Netherlands, and some new advances have been 
made there. In industry it is used by the Patent Divis- 
ion of Philips' Gloeilampenfabrieken, Eindhoven (59), 
Hollandsche Signaal Apparatenfabriek (special cards; de- 
tails unavailable), and Allgemeene Kunstzijde Unie, 
Arnhem (Delta cards, infra). Th. P. Loosjes of the 
Centrum Voor Landbouwdocumentatie, Gen. Foulkesweg 
1a, Wageningen, one of the foremost theorists on the
use of punched cards in documentation, reports the use 
of Delta cards in the Institute for Land and Water Man- 
agement Research and in agricultural field experiments 
in Wageningen. In a private communication he says he 
has used Sphinxo cards but he emphasizes their limita-
tion to 1,000 positions.

J. Westendorp (59) applies feature cards to patent 
literature at Philips in much the same way that Batten 
does. He uses standard IBM (or Hollerith) cards with 
800 positions as feature cards, although he points out 
that two extra rows on these cards actually allow for 
960 punchable positions. Since Philips also uses ma- 
chine-selected IBM cards, Westendorp is in a favorable 
position to contrast the manual and the mechanical sys- 
tems or simultaneous versus sequential scanning over 
and above the matter of expense of initial outlay for 
equipment. He is particularly impressed by the possib-
ility of adding as many feature cards as are needed to
the pack, whereas IBM and other item cards place def-
inite limits on the addition of subjects to the code. On
the other hand, Westendorp uses his machine-selected
IBM cards for searching for several ideas at the same
time, something that the feature cards do not permit in
their present stage of development. Westendorp states
that it is his experience that IBM cards wear out under
machine sorting more quickly than do feature cards.
Finally, Westendorp points out the inevitable disadvan-
tage of feature cards, that a new series must be started
when all positions on one are filled, whereas one can
add item cards indefinitely and in a single series to a
machine-selected IBM system.

Delta cards are produced by Semper Avanti, 
Losduinenweg 507, The Hague (10). The great advant- 
age of the Delta card is that it provides more punching 
points (10,000) on the area provided than other feature
cards of comparable size except those which use pin-
point perforations (e.g., the U.S. Office of Basic In-
strumentation). This is made possible 1. by the rhom-
boid design of the squares and 2. by the arrangement
to punch not in the preprinted areas but on the inter- 
secting lines. The punching machine has a directive 
needle to facilitate this work, but the punching device 
is hardly so refined as to be called a precision instru- 
ment. The reading of the card is made easier by di- 
viding the card into 100 compartments (with 100 inter- 
sections in each compartment) instead of 100 rows on 
both x and y, printing the fifth line (both horizontal and 
vertical) in heavier ruling, placing dots at intersections 
of the third and eighth vertical lines, and placing
odd vertical lines at a slightly higher position than
the even ones. The card measures 32 cm x 23.5 cm.
For these reasons Loosjes (43, 44, 45) argues for the 
superiority of the Delta card over other feature cards. 
Loosjes advocates the use of Synoptic filing (cf. 41, 51) 
as a convenient device for aiding in maintenance of ar- 
rangement of feature cards (45), since the signals in 
this system allow refiling without studying the headings. 
He thinks the use of Uniterm is desirable for the head- 
ings of the feature cards, since it enables the searcher 
readily to select the cards he wants merely by noting 
what numbers they have in common (45). The punched
positions could correspond to the numbers written out 
on the Uniterm card. Thus the possibility of error in 
visual comparison of numbers on Uniterm cards is elim- 
inated, and it is unnecessary to compare Uniterm cards 
in several steps and to record all the common numbers
on a blank card at each step. 

Loosjes has a deeply rooted faith in the utility of 
feature cards; and, aside from their technical virtues, 
he places special emphasis on their ability to reflect 
the dynamics of scientific research (43, 45). As re- 
search uncovers new ideas and new viewpoints, approp- 
riate feature cards may be added to a set; but in the 
edge-notched cards, the number of subjects that may be 
handled is limited unless one wishes to work with mul- 
tiple sets. 



9. Sweden 

9. In Scandinavia Carl Bjorkbom has reviewed 
briefly the literature of feature cards (5), and the Sys- 
tems Division of Esselte (Bryggargatan 17, Stockholm) 
has developed and exploited them under the name of 
"Findex" (23, 24). There are two Findex cards, one 
with 7,000 positions, the other with 3,000 positions.
The Swedish cards offer no different ideas or new ap- 
plications. The main inspiration comes from the 
Cordonnier cards (Selecto), and the firm's organ even 
contains an article citing an example obviously taken 
from the experience of the Institut des Fruits et Ag- 
rumes Coloniaux (24). The cards show some features 
of the Delta cards, with heavy guide lines and with di-
vision into squares of a hundred positions in the 3,000-
position card. However, the cards are punched in the
squares rather than at the intersections of the lines.
Some of the German feature cards (e.g., Ekaha and
Aliform, infra) are also said to be used in Scandinavia, 
but specific information on their applications is not 
available. 



10. England

10. In England and the United States some definite 
practical advances and, in the latter country, imagina- 
tive applications of feature cards have been made. It 
is of some significance to note that recorded uses of 
feature cards in the United States are by federal govern- 
ment or federally subsidized agencies. In England, the 
Netherlands, Sweden, and West Germany, they have 
been developed commercially. 

In England feature cards are known to have been
produced or promoted by three firms: 1. Carter-Parratt,
Ltd., Iddesleigh House, Caxton Street, London, S.W. 1,
produces the Brisch-Vistem card, 15 x 28 cm., with
1,000 positions, "like Ekaha [infra] but smaller cards,
with good layout facilitating quick location of positions"
(54); 2. Industrial Studies and Investigations, formerly
located at 40A High Street, Hampstead, London, N.W.
8 (27), but present address unknown, and particulars
on their cards lacking; and 3. J. L. Jolley and Part-
ners, Ltd., New Road, Great Missenden, Bucks, with
an American office under the name of Brisch, Inc.,
1070 Union Commerce Bldg., Cleveland. Jolley and
Partners are successors to a British firm known as
Brisch Indexing, Ltd.; and Carter-Parratt makes and
markets the Brisch-Vistem cards.

The Carter-Parratt cards range from 1,000 to
2,500 positions, although this firm is willing to make
cards with larger capacity and smaller holes, making
allowance for the disadvantages of the larger cards
(communication of J. L. Jolley in writer's file). At
present, the Jolley-Brisch group puts strong emphasis
on the use of the Carter-Parratt cards for personnel
records (7a, 34), but there is also attention to other
possible applications (34, 36) and to the theoretical
structure of feature card systems (35, 36). The Carter-
Parratt feature cards are recommended for special li-
braries (7a, 33) and for operational research, medical
research (cf. 1, 11, 12, 13, 14, 15 and discussion,
infra), hospital records, market research, photographic
print and negative indexing, social surveys, property
records, fault recording and correlation, and criminal
records (7a).

The literature available on Carter-Parratt cards
and their applications is exclusively from the manufact- 
urers and promoters and is thus based on successful 
applications. 

The following claims are made in the promotional 
folder (7a): 

High speed of operation. Information is ob- 
tained within minutes. For example, all person- 
nel possessing (say) four required characteristics 
can be selected from 1,000 individuals in about
ten seconds. 

Constant control of data. The user has abso- 
lute control over the data at all times because the 
unique sorting process requires the cards to be 
absent from their place in the record for a few 
seconds only. 

Simplicity. After initial punching, all possible
sums and correlations are immediately available. 

Economy. The initial cost is reasonable, and 
maintenance costs are negligible. The character- 
istic cards can be used almost indefinitely. 

Compactness. Up to 1,000 elements of infor-
mation can be punched on one characteristics card,
and 1,586 such cards can be filed visibly in a unit
33" x 24".

Versatility. The system caters for both chang- 
ing and static data and can therefore be applied to 
a wide variety of records. 

This summary of the virtues of feature cards 
clearly applies to pertinent situations that the promoters 
have found in their work with specialized classifications. 

Jolley has made certain comparisons of the feature 
card with the item card which are rather unfavorable to 
the latter (33). He re-emphasizes the basic fault of a
system involving item cards, that it is impossible to 
add a new feature after the feature field is full without 
also adding a complete new set of item cards. While 
the converse is true of feature cards, there are likely 
to be fewer features than items. Classification of fea- 
tures is more important than classification of items, 
and an index can take care of the latter. If items are 
added seriatim, clusters of holes in particular parts of
a feature card may be evidence of correlation. Random
distribution of features is also clear to the eye, and it 
will yield evidence of correlation. If a significant num- 
ber of items show through holes when two cards are 
superimposed, there is evidence of correlation between 
the two features. 
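
Jolley's remark on correlation can be illustrated with a
minimal sketch. It assumes that each feature card is
modelled as a set of item numbers (its punched positions),
so that the holes showing through two superimposed cards
are the intersection of the two sets. Comparing the observed
count with the count expected if the two features were
independent is one rough test, chosen here for illustration
and not taken from Jolley. The function names and the
sample figures are hypothetical.

```python
def through_holes(card_a, card_b):
    """Positions punched on both cards, i.e. holes that show light."""
    return card_a & card_b


def expected_overlap(card_a, card_b, total_items):
    """Overlap expected if the two features were independent."""
    return len(card_a) * len(card_b) / total_items


# Hypothetical file of 1,000 items; both features are invented.
TOTAL_ITEMS = 1000
feature_x = set(range(0, 100))    # items 0-99 carry feature X
feature_y = set(range(50, 150))   # items 50-149 carry feature Y

observed = len(through_holes(feature_x, feature_y))
expected = expected_overlap(feature_x, feature_y, TOTAL_ITEMS)
```

In this invented case fifty holes show through where about
ten would be expected by chance, which is the kind of excess
that would suggest correlation between the two features.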

Jolley has tried to bring out the versatility, econ- 
omy, and simplicity of his use of the Carter-Parratt 
cards, and he has made a convincing case on the basis 
of the experience of his firm. 

Holmstrom, who is working with Jolley and Part- 
ners on technical improvement of feature cards, states 
in correspondence with the writer that he has invented 
an apparatus to overcome the two main disadvantages 
of feature cards, viz., 1. the time required for putting
feature cards back in a file after removing them for 
search, and 2. the fact that the searching operation 
must be repeated on a separate set of cards each time 
the capacity of one set is exceeded. Details of his in- 
vention are not available, although he has deposited pro- 
visional specifications both with Jolley and with the 
British Patent Office. 



11. The United States 

11. In the United States two federal government 
agencies, the Atomic Energy Commission and the Office 
of Basic Instrumentation of the National Bureau of 
Standards, have made effective and imaginative use of 
feature cards. A commercial firm, Documentation In- 
corporated, Washington, D. C., has made important 
contributions to the mechanization of feature cards with 
the aid of federal subsidies.

The Technical Information Service of the Atomic 
Energy Commission has developed a set of feature cards 
to ascertain easily and quickly which nuclides possess 
specified combinations of properties (57, 58). IBM 
cards were used on account of the relatively large num- 
ber of punching positions (800) on each card and be- 
cause they can be punched and collated for distribution 
with automatic machinery. The complete index includes 
two sets of cards, each in a different color and each 
indexing about half of the total population of nuclides 
(some 1,200 in all). Each nuclide is assigned a punch-
ing position, and there are cards for each property.
When the cards are superimposed, the through holes 
identify nuclides which have the properties represented 
by the cards at hand. Since there are only 200-300 
properties, the number of cards needed is less than 
half the number needed for an edge-notched system pro- 
posed for the same purpose in ORNL 883 (1951), since 
an edge-notched system would have required a card for 
each nuclide. Three columns not assigned to nuclides 
are conveniently used for collating the cards for distrib- 
ution. A special advantage of feature cards in this 
situation is that it is unnecessary to refer to an index 
to identify a punched position with a nuclide: the ele-
ment symbol is printed above each row (but below row
one, on either side of the spaces assigned to the ele-
ment), and the column number, which is the neutron
number, is printed across the top and bottom of the
card. There is a special device to aid in the
refiling procedure. The group of cards behind each 
guide card is notched at a different point along the top 
edge of each card, and misfiling is readily noted by 
any break in the groove formed by the notches. It is 
claimed that this set of cards is cheap to use and pro- 
vides for unlimited expansion. When new data are 
available, pertinent new property cards can be issued 
and old ones destroyed. New property cards can be 
added at will. Any kind of information about nuclides
can be indexed. The sets are produced in quantity and 
distributed to interested research points. 
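
The principle of the nuclide index can be sketched as
follows. The sketch assumes that a punched position is
identified by the element symbol of its row and the neutron
number of its column, and that each property card is a set
of such positions. The three property names and the handful
of light nuclides are chosen only for illustration and do
not reproduce the Commission's cards.

```python
# Each property card is modelled as a set of punched positions.
# A position is (element symbol, neutron number), mirroring the
# row and column labels printed on the card.
property_cards = {
    "stable": {("H", 0), ("H", 1), ("C", 6), ("C", 7)},
    "beta emitter": {("H", 2), ("C", 8)},
    "carbon isotope": {("C", 6), ("C", 7), ("C", 8)},
}


def superimpose(cards, *properties):
    """Return the positions punched on every selected property card."""
    selected = [cards[name] for name in properties]
    return set.intersection(*selected)


def label(position):
    """Read a nuclide name straight off the row and column labels."""
    symbol, neutrons = position
    return f"{symbol} (N={neutrons})"


hits = superimpose(property_cards, "stable", "carbon isotope")
names = sorted(label(p) for p in hits)
```

Because the position itself carries the element symbol and
neutron number, the through holes can be named without
consulting a separate index, which is the advantage noted
above.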

The National Bureau of Standards' Office of Basic
Instrumentation, of which the chief is William A. Wild-
hack, was set up with the purpose, inter alia, of pro-
viding U.S. government laboratories and scientists work-
ing on government contracts "with more complete ac-
cess to existing information on measuring instruments,
controls, and data-handling devices" (31). Wildhack,
working with Joshua Stern of this office, has made sub- 
stantial progress in the use and development of possi- 
bilities of feature cards. Their objective has been to 
minimize problems of indexing, storing, and searching 
information. A basic problem lies in the fact that in- 
strumentation literature is actually a part of many other 
fields and is not organized anywhere as a separate 
field (60). Hence the Office of Basic Instrumentation 
was compelled to develop its own system for collection, 
organization, and retrieval of pertinent references. One 
important consideration in indexing the literature of a 
heterogeneous field in which no one is an authority is to
make sure that a common language is used by the doc- 
ument analyst and subsequent searchers, and the Office 
of Basic Instrumentation believes it achieved this object- 
ive by setting up ten basic categories, representing ma- 
jor points of view of potential searchers (60, 61). Un- 
der each category there is a varying number of primary 
terms, from ten to 300, and reference terms (synonyms). 
The feature cards are filed by category and alphabeti-
cally within each category.

The Office of Basic Instrumentation found that it 
had some 10,000-15,000 references to index every year. 
For this purpose it developed 5" x 8" vinylite cards
with 180 columns in 100 rows to provide for 18,000 
references. Thus one series of cards will last for at 
least a year under present publication conditions; and 
when its punchable positions are exhausted, a new series 
must be initiated. The holes are necessarily quite 
small: 0.025" in diameter, and spaced approximately
0.040" on centers. Since no American firm manufact-
ures feature cards as such with perforators and readout
devices, the Office of Basic Instrumentation had to de- 
velop its own equipment (61). 
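
The arithmetic of the 18,000-position card can be checked
with a short sketch. The figures for columns, rows, and
spacing come from the description above; the row-by-row
numbering of references is an assumption made for this
illustration, since the source does not say how serial
numbers are laid out on the card.

```python
COLUMNS = 180
ROWS = 100
PITCH_INCHES = 0.040  # approximate centre-to-centre spacing from the text


def position_for(reference_number):
    """Map a serial number (1..18,000) to a 0-based (row, column) pair.

    Row-by-row numbering is an assumption made for this sketch.
    """
    if not 1 <= reference_number <= ROWS * COLUMNS:
        raise ValueError("reference number exceeds the capacity of one series")
    return divmod(reference_number - 1, COLUMNS)


def grid_extent_inches():
    """Approximate width and height spanned by the hole centres."""
    return ((COLUMNS - 1) * PITCH_INCHES, (ROWS - 1) * PITCH_INCHES)
```

On these figures the hole centres span roughly 7.16" by
3.96", which fits within a 5" x 8" card, and a reference
numbered above 18,000 must go into a new series.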

The level of retrieval of information is a problem 
of a large proportion of indexing systems, since rela-
tively few actually give the information wanted. The
Office of Basic Instrumentation planned to provide ab- 
stracts and consider the retrieval process complete 
when they were delivered (61). If the abstract were 
available simultaneously with identification of the ref- 
erence, then much time could be saved by the search- 
ers. Looking toward this end, the Office of Basic In- 
strumentation has developed microcite on an exper- 
imental basis (47, 61). Abstracts are photographed in 
the appropriate reduction on a matrix film, exactly the 
size of the feature cards, and the photograph is located 
in precisely the same area covered by the hole punched 
for the document which it describes. The full area of 
the film may be used, since the exposures, unlike the 
holes, require no supporting area between them. The 
abstract is read through a microscope or magnifier. 
The experimental model is based on a card with 1,000
positions, and 3" x 5" typed slips are photographed in
a 30 to 1 reduction. If no higher filming ratios are 
used, extension of the microcite principle to the main 
sets will require twelve film matrices for each set. 
However, another suggested development (61) might per- 
mit the use of the holes simply to locate the corre- 
sponding microphotographed area, not to illuminate the 
area. Microcite is essentially a return to the primitive 
efforts of Taylor (56) and the mineralogists (16, 17, 18, 
21, 28) to eliminate the role of the punched position as 
an intermediary to get at the needed information and to 
provide it directly. 

Wildhack, Stern, and Smith (61) point out the ver- 
satility of feature cards, a quality which Jolley (33) and 
others have emphasized. At the Office of Basic Instru- 
mentation it is believed that it may be useful for search- 
ers to know which references correspond to the coinci- 
dence of a given number of terms that describe the 
question and which references include a further term 
not so definitely implied by the question. Again, man- 
ipulation of the cards may allow statistical analysis of 
the literature. A mere count of the holes on a given 
card (e.g., electromagnetic flowmeters) will give some
notion of the extent to which the subject is handled. The 
number of these that deal with the theory of operation 
of these devices may be ascertained merely by counting 
the through holes when the theory card is put over the
first card. 

While the Office of Basic Instrumentation's rather 
simple mechanical procedures will not be reviewed in 
detail here, it is of interest to note that they report no 
trouble in adequate alignment of their very small holes. 
The cards may expand and contract with humidity chang- 
es, but "the differential change with humidity from card 
to card will be much less than the change itself" (61).

Working under contracts with the Armed Services
Technical Information Agency (ASTIA) and the National
Science Foundation, Taube and his colleagues (55, V. 2
and 4) have attempted to mechanize and expand the fea-
ture card principle in order to handle large collections
of documents mechanically. Their work is based on
theoretical considerations (V. 2, "The Mechanization of
Coordinate Indexing," and V. 4, "Superimposed Coding
for Data Storage with an Appendix of Dropping Fraction
Tables"); and a rather crude model of a selection and
readout machine has also been developed (55, V. 4,
"The Prototype Mechanical Alpha-Matrex Machine").
The ideal of an information retrieval system is simul- 
taneous scanning and instantaneous retrieval. Edge- 
notched and feature cards provide this advantage in a 
limited way, and the Taube group was interested in ex- 
panding the scope and efficiency (defined as a combina- 
tion of cost and effectiveness of retrieval of information) 
of feature cards. 

The Taube group, like nearly all documentalists 
who have worked with feature cards, was concerned 
with the fact that all positions on a given card were 
dedicated to a specific group of documents and that the 
number of positions was limited. Moreover, the pos-
sible size of the pack of feature cards and its imper-
manence (i.e., the necessity of adding new sheets for new
positions) were seen as obstacles to mechanization. In
order to overcome these problems, the Taube group 
proposed (55, V. 2) the creation of artificial alphabets. 
Feature cards were to be dedicated to letters of these 
artificial alphabets instead of to vocabulary terms. Thus, 
if we have 260 letters or ten alphabets, we can express
any word up to ten letters in any Roman alphabet lang-
uage. Accordingly, all items on accuracy, airplane,
acid, accelerometers, etc., will be posted on the A1
sheet; all on acid, accuracy, accelerometers, etc., on
the C2 sheet; all on accuracy and accelerometers on
the C3 sheet; all on airplane and accelerometers on the
L5 sheet, and so on. Thus the number of feature cards is
reduced, space on them is used more economically
(about a third of all positions), and the size of the file
is stabilized.
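
The posting rule just described can be sketched briefly. The
sketch assumes that a sheet is named by a letter and its
position in the word (A1, C2, and so on), that each sheet
is a set of item numbers, and that a search superimposes the
sheets called for by the search word. The function names and
item numbers are hypothetical.

```python
from collections import defaultdict


def sheet_labels(term, max_letters=10):
    """Sheets a term is posted on: letter plus its position, e.g. 'A1'."""
    letters = [ch for ch in term.upper() if ch.isalpha()][:max_letters]
    return [f"{ch}{i}" for i, ch in enumerate(letters, start=1)]


def post(index, term, item):
    """Punch the item's position on every sheet the term calls for."""
    for sheet in sheet_labels(term):
        index[sheet].add(item)


def search(index, term):
    """Superimpose the term's sheets; surviving positions are the answers."""
    sheets = [index[label] for label in sheet_labels(term)]
    return set.intersection(*sheets) if sheets else set()


index = defaultdict(set)
# Item numbers are hypothetical.
post(index, "accuracy", 1)
post(index, "airplane", 2)
post(index, "acid", 3)
post(index, "accelerometers", 4)
```

With these four postings the A1 sheet carries all four
items, the C2 sheet three, and the C3 and L5 sheets two
each, as in the example in the text. In this sketch a search
for a short word also yields any item posted under a longer
word that begins with the same letters.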

While this as yet untested device will provide for 
more economical use of feature cards, it increases the 
possibility that two or more entries will overlap. Ac-
cordingly, we have the element of "superimposition
noise," in which unwanted intelligence may be delivered
along with the wanted. Thus, in designing a mechanical
selection device based on superimposition, the Taube
group had to determine the dropping fraction (i.e., that
portion of the whole memory field yielded by a random
search of the field) most appropriate to a particular
type of coding field and size of collection. To ascertain 
the average number of false drops per search, we mul- 
tiply the number of items in a collection by the drop- 
ping fraction. To this end the Taube group has pre- 
sented special and general formulas for dropping frac- 
tions and tables for the different sizes of coding fields 
(55, V. 4). With these tables, they claim, it is pos- 
sible to select a coding field large enough, and a de- 
gree of superimposition so controlled that the dropping 
fraction is acceptable. 
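
The use of such tables can be sketched as follows. Only the
rule that false drops equal the number of items multiplied
by the dropping fraction comes from the text; the table of
dropping fractions below is invented for illustration and
does not reproduce the published tables (55, V. 4), and the
function names are hypothetical.

```python
def expected_false_drops(collection_size, dropping_fraction):
    """Average false drops per search: items times dropping fraction."""
    return collection_size * dropping_fraction


def smallest_acceptable_field(table, collection_size, tolerance):
    """Pick the smallest coding field whose false drops stay in tolerance.

    `table` maps coding-field size to dropping fraction.
    Returns None when no listed field is large enough.
    """
    for field_size in sorted(table):
        drops = expected_false_drops(collection_size, table[field_size])
        if drops <= tolerance:
            return field_size
    return None


# Hypothetical dropping fractions, NOT the published tables (55, V. 4).
HYPOTHETICAL_TABLE = {1000: 0.01, 5000: 0.001, 10000: 0.0001}
```

On these invented figures a collection of 10,000 items with
a dropping fraction of 0.001 would yield about ten false
drops per search, so a searcher willing to tolerate only two
would need the largest of the three fields listed.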

To implement these ideas the group developed a 
rather primitive type of machine, the Alpha-Matrex
(55, V. 4). Cards measuring 16" x 17" were devised
to carry 10,000 positions generously spaced at ten
1/16" holes per inch, although a model with a capacity
of 40,000 positions is also described. As with Cordonnier
(8, 50) and Wildhack and associates (61), the selection
of material for the cards (in this case, a specially lam-
inated plastic material) was a problem requiring much
study. Other equipment, including the input drill fix-
ture, the selection mechanism, and the readout fixture,
was constructed as part of the project. The selection
is implemented simply by typing out the search words
on a seven-row keyboard of twenty-two characters (with
JK, PQ, and XYZ combined) each; and after the cards
are selected, they are placed in a readout fixture be-
hind a translucent plastic raster with 10,000 squares.

Since the Alpha-Matrex machine accepts any nota- 
tion that can be made with letters (or, in contemplated 
later models, with numbers), a much larger number of 
vocabulary terms can be stored here than in purely 
manual feature cards, Uniterm cards, or conventional 
catalogs, and there is no need for "satellite catalogs"
for authors, sources, projects, etc. All terms are, of
course, in seven letters, either expanded thereto by
repetition (e.g., air = airaira) or contracted (e.g.,
acceler = accelerometer). No cases of unwanted an-
swers were reported in service testing.
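
The seven-letter convention and the twenty-two-character
rows can be sketched together. The sketch assumes that the
nth row of the keyboard takes the nth letter of the
seven-letter form, which the source implies but does not
state; the function names are hypothetical.

```python
TERM_LENGTH = 7
# Letters that share a key on the 22-character rows described in the text.
MERGED_KEYS = {"J": "JK", "K": "JK", "P": "PQ", "Q": "PQ",
               "X": "XYZ", "Y": "XYZ", "Z": "XYZ"}


def seven_letter_form(term):
    """Repeat a short term or cut a long one to exactly seven letters."""
    letters = [ch for ch in term.lower() if ch.isalpha()]
    if not letters:
        raise ValueError("term contains no letters")
    repeats = -(-TERM_LENGTH // len(letters))  # ceiling division
    return "".join(letters * repeats)[:TERM_LENGTH]


def keys_for(term):
    """One (row, key) pair per letter, rows numbered 1 to 7."""
    form = seven_letter_form(term).upper()
    return [(row, MERGED_KEYS.get(ch, ch)) for row, ch in enumerate(form, 1)]


def keys_per_row():
    """Count distinct keys after merging: 26 letters reduce to 22."""
    alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    return len({MERGED_KEYS.get(ch, ch) for ch in alphabet})
```

The sketch reproduces the two examples in the text (air
becomes airaira, accelerometer becomes acceler) and confirms
that combining JK, PQ, and XYZ leaves twenty-two keys in
each row.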

The Taube group claims that the Alpha-Matrex 
machine is faster in presentation of answers than any 
other device, including the highest-speed electronic se-
quential searching devices; that, for its range of capac-
ity, bit storage costs are well below those of any other 
machine for information retrieval; that it has unusual 
freedom from mechanical failure. It is recognized that 
Alpha-Matrex is only an intermediate device for infor- 
mation retrieval, as contrasted with microcite, East- 
man's experimental minicards, and an IBM card carry- 
ing an abstract. The developers of Alpha-Matrex say 
that the most unsatisfactory aspect of the machine is 
that more intelligence must be applied to the search 
problems to achieve maximum effectiveness. 



12. The Germanies 

12. Although the Germans had no part in the early 
development of feature cards, they have used them 
more extensively than any other national group; and the 
recorded applications of feature cards in the Germanies 
reveal a wide variety of uses. 

Most of the reported instances of the use of fea-
ture cards are in West Germany. In East Germany fea-
ture cards with 3,500 and 1,000 positions are produced
by the VEB Organisationsmittel-Verlag in Leipzig and
are used in hospitals and factories (communication from
the Deutsche Bücherei). The only library known to use
feature cards is that of the Hochschule für Elektrotech-
nik in Ilmenau (Thuringia), but details of this applica-
tion are not available. A communication from the
Vsesoiuznaia Gosudarstvennaia Biblioteka Inostrannoi
Literatury in Moscow indicates that feature cards are
known and have been studied in the Soviet Union and
the people's democracies, but so far no library is
known to have used them.

Two West German office supply firms produce 
widely used feature cards. Aliform, Brandenburgische 
Strasse 27, Berlin W. 5, advertises three types of 
cards and a simple punching apparatus (2), and there is 
a full account of their practical application in Jaeckle's
Wirtschafts-Praxis (32). For comparatively small col-
lections of documents Aliform offers a card with 2,000
positions, divided into twenty squares of 100 positions.
For larger collections there is a 6,000-position card.
In addition, there is a card for chronological indexing
over a period of five years. Thirty-one positions (the
maximum number of days in a month) are provided in
each vertical column. The twelve vertical columns to-
tal 372 positions for one year, 1,860 for five. Thus
not only correspondence but also daily collections of
data can be indexed by subject. At the top of all Ali-
form cards there are twenty-five positions for the let-
ters of the alphabet (xy being one unit) to be notched to
show at a glance the alphabetical position of the partic-
ular card; and on the last fifth of the upper edge there
are six numbered positions to be notched to show the
series. Thus any misfiled card can readily be noted. 
Aliform cards are marketed in the Netherlands under 
the name of Transelecta (45). 
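
The layout of the chronological card can be checked with a
short sketch. The figures of thirty-one positions per column
and twelve columns per year come from the description above;
the arrangement of the five years as consecutive blocks, the
flattened serial number, and the sample year are assumptions
made only for illustration.

```python
from datetime import date

DAYS_PER_COLUMN = 31     # one column per month
MONTHS_PER_YEAR = 12
YEARS_PER_CARD = 5
POSITIONS_PER_YEAR = DAYS_PER_COLUMN * MONTHS_PER_YEAR      # 372
POSITIONS_PER_CARD = POSITIONS_PER_YEAR * YEARS_PER_CARD    # 1,860


def chronological_position(day, first_year):
    """Return (year block, month column, day row), all 1-based."""
    year_block = day.year - first_year + 1
    if not 1 <= year_block <= YEARS_PER_CARD:
        raise ValueError("date falls outside the five years of this card")
    return (year_block, day.month, day.day)


def serial_position(day, first_year):
    """Flatten the triple to a single number from 1 to 1,860."""
    year_block, month, day_of_month = chronological_position(day, first_year)
    return ((year_block - 1) * POSITIONS_PER_YEAR
            + (month - 1) * DAYS_PER_COLUMN
            + day_of_month)
```

Since every month is allotted thirty-one positions, those
for days that do not occur (e.g., the thirty-first in a
thirty-day month) are simply never punched.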

Edler und Krische, Kestnerstrasse 42, Hannover, 
offer Ekaha feature cards and a simple punching appa-
ratus (2). The Ekaha cards have 7,000 positions and
are twice as large as the 6,000-position Aliform cards,
with correspondingly larger squares for punching. The
cards come in several colors for different series.
Braband (7) describes the special applications (although
not referring to Ekaha cards by name) and places par-
ticular emphasis on their use in searching patent litera-
ture.

Edler und Krische have printed a somewhat differ- 
ent type of card that is used by the Deutsches Kunststoff- 
Institut, Schlossgartenstrasse 6 R, Darmstadt (39, 40). 
The cards have 6,000 positions and are divided into six-
ty squares of 100 positions with numerals from 0 to 59
lightly printed over them. The intersections of the 
lines, not the squares, as in standard Ekaha cards, 
are punched. Like the Aliform cards, this type has an 
alphabet at the top to be notched as a filing aid. Sim- 
ilarly, a group of numbers from 1 to 10 on either side 
of the card could serve as identification of the series 
to which the card belongs. The Kunststoff-Institut main- 
tains an abstract file with a number corresponding to 
the document's numerical position on the feature card,
and there is also an alphabetical author index. Knappe 
states that the classification by which the feature cards 
are arranged is being constantly refined (39), and his 
problem is in general rather similar to Batten's (4) in
this respect. As a special virtue of feature cards as 
used in the Kunststoff-Institut, Knappe (40) says that the 
time required for searching does not increase in pro-
portion to the increase in the number of documents,
whereas the ratio of searching time to number of docu-
ments remains constant in the case of edge-notched and
machine-sorted cards.

German scientists have put feature cards to effect- 
ive use in a wide variety of situations. Martin Scheele, 
a limnologist, devoted a substantial part of his study of 
punched cards to the Selecto system, using his own field 
as an example (48). He cites as a special quality of 
feature cards the fact that additional documents can be 
added at any time and subjected to the full range of sub- 
ject analysis already at hand, or to new subjects, if 
necessary. He recommended Hollerith (IBM) or Powers 
cards (with 960 and 540 positions respectively) for lim- 
ited collections, and he points out that the Powers cards 
are especially handy due to the availability of an inex-
pensive hand punch. Karl Eduard Rotschuh, a Munich
physiologist, has followed the same line and emphasized 
the utility of feature cards for reprint collections (47a).
He developed his own cards for the 3,700 reprints al-
ready in his collection and the some 3,000 more he ex-
pects to acquire. He divided his cards into squares of
100 positions assigned to various letters of the alphabet 
according to their frequency as initials of the last 
names of authors already in his collection. He assigned 
350 catchwords to feature cards and arranged them ac- 
cording to a simple home-made classification. Kister- 
mann and Uhlein (37, 38) recommend feature cards for 
essentially the same type of collection of references. 
They recommend edge-notching of the feature cards as 
a convenient selection aid by which the desired cards 
may be pulled from the pack (37; also recommended by 
Holmstrom, 22). 

The Germans have used feature cards widely in 
various aspects of medical research and medical rec- 
ords. Scheele placed heavy emphasis on this applica- 
tion. Adler (1) applied Aliform cards to pharmaceutical 
research. Udo Derbolowsky, a Hamburg physician, has 
written extensively on feature cards in medical research 
(11, 12, 13, 14). In an earlier article (15) he had 
recommended machine searching for patent records, rec- 
ognizing clearly the relatively high cost of this method; 
but later he showed how essentially the same informa- 
tion could be retrieved by use of Aliform feature cards 
(13, 14). At the same time he demonstrates the docu- 
mentary applications of feature cards (13, 14). He 
recognizes the superior utility of Hollerith (IBM) cards 
for tabulation and correlation, but he advocates the use 
of Selecto cards for limited literature searches (12). In 
a chiropractic study of pelvic movements he recom- 
mends Selecto cards as a medium for recording obser- 
vations, but he goes into no details on the specific ap- 
plication (11). 

While the Germans have demonstrated the large 
variety of uses of feature cards, only Heinze (25a) has 
done any imaginative, original work to refine and ex- 
pand the applications of feature cards, and even he has 
limited his work to indexing journal articles. On a 
mathematical basis Heinze demonstrates the need for 
cutting down on clerical work in compiling references 
and at the same time recording all subjects of potential
interest on the appropriate cards. He is particularly 
eager to eliminate the copying of exact bibliographical 
references in full and the waste of space once the max- 
imum number of references has been attained on a fea- 
ture card. He overcomes both of these handicaps by 
punching positions that correspond to page numbers of 
articles instead of giving each reference a serial
number and punching the corresponding hole. He allots
1,000 positions each year to each feature card for this
purpose, and he writes the abbreviation of the periodical
in question beside each perforation (preceded by 1, 2,
etc. if the page number is greater than 1,000, 2,000,
etc.). If, by chance, two or more articles on the
same subject appeared in different journals in the same
year with the same page number, an auxiliary card of
a different color could be used. Incidental advantages
of this device are to show the volume of relevant lit- 
erature published in any one year and the possibility of 
studying obsolescence of references by checking or ring- 
ing the holes with a different color of ink representing 
the year in which they were consulted. 
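
Heinze's positional scheme, as described above, may be
sketched in a few lines of Python. This is only one
reading of the description, not Heinze's own notation.
The names FeatureCardYear, punch, and volume are
hypothetical, and the sketch assumes that a page number
above 999 is punched at its last three digits, with the
thousands digit written beside the hole.

```python
from dataclasses import dataclass, field

POSITIONS_PER_YEAR = 1000  # positions allotted per year on each feature card


@dataclass
class FeatureCardYear:
    """One year's block of positions on a single feature (subject) card."""
    subject: str
    year: int
    # position -> handwritten note beside the hole (hypothetical format)
    holes: dict = field(default_factory=dict)
    # entries that collide with an existing hole go to an auxiliary card
    auxiliary: list = field(default_factory=list)

    def punch(self, journal_abbrev: str, page: int) -> int:
        """Record an article by its page number; return the punched position."""
        thousands, position = divmod(page, POSITIONS_PER_YEAR)
        # The thousands digit precedes the abbreviation for pages of 1,000 and up.
        note = f"{thousands} {journal_abbrev}" if thousands else journal_abbrev
        if position in self.holes:
            # Assumption: any clash on a position is sent to the auxiliary card.
            self.auxiliary.append((position, note))
        else:
            self.holes[position] = note
        return position

    def volume(self) -> int:
        """Number of references recorded for this subject in this year."""
        return len(self.holes) + len(self.auxiliary)


card = FeatureCardYear("plankton", 1958)
card.punch("J.A", 345)    # hole at position 345, note "J.A"
card.punch("J.B", 2345)   # same position; recorded on the auxiliary card
```

No reference is copied out in full: the position, the
year, and the marginal abbreviation together identify
the article, which is the clerical saving Heinze seeks.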

Heinze promises a subsequent article in which he 
will examine the possibilities for photo-electric select- 
ion of feature cards. Here is a possible approach to 
the solution of the search problem which the Taube 
group views as the most serious barrier to maximum 
effectiveness in the use of feature cards (55, V. 4). 



2. Horace Taylor's Patent 

2. Taylor's "selective device" (56) is claimed to
identify "a particular species or a particular number of
a particular species ... with ease and accuracy." His
illustrations in the original patent show an adequate 
sample of New England birds. There is no evidence 
that this system is generally effective for ornithological 
taxonomy. This fallacy is implicit in many of the es- 
says on specific applications of feature cards and will 
not be recited in detail, although the critical reader 
should bear in mind that much of the literature on fea- 
ture cards is aimed at individual applications. Only 
the Taube group has given appropriate attention to the 
dropping fraction (55, V. 4) and the problem of un- 
wanted intelligence that may be yielded along with the 
wanted. 

Taylor says his cards are "so arranged" that the 
screen sheets may be turned to four positions (recto, 
top and bottom; verso, top and bottom) to be applied to 
four different base sheets, but his cautious patentese 
style refrains from disclosing the secret of this
important trick. The method of his quadriform arrangement
should be disclosed; and, if generally applicable to fea- 
ture card systems, it should be refined. 



3. H. E. Soper's Patent

3. Soper (52) gives examples of classification of 
individuals, for instance, by occupational class or 
health grade, as examples of the compilation of
"tabular and statistical data" on his feature cards. His
broad claim of superiority "over systems of indexing 
employing cross references or multiple entry" is not 
supported. Subsequent publications provide evidence in 
favor of this idea in specific situations, but no one has 
established the general validity of Soper's claim. This
matter should be a key point in consideration of specific 
new applications of feature cards. 



5. Feature Cards for Mineral Identification

5. Gray (25), Donnay (16, 11), and Fairbanks (21) 
do not argue for the infallibility of their feature cards, 
but they say that they are superior to conventional ta- 
bles for mineral identification. In each case they cite 
examples of the inadequacies of tables, although all but 
one article (17) is conservative in claims for the infal- 
libility of feature cards in mineral identification. 

A much more fundamental problem is involved in 
Hurlbut's claim for the superiority of edge-notched
cards in mineral identification over the feature cards
developed by Donnay. Hurlbut's argument that one
must thumb through a whole pack of feature cards to 
find the properties for which one is searching is not 
supported. Batten (4), Cordonnier (8), the Institut des 
Fruits et Agrumes Coloniaux (29, 30), and others have 
proven that adequate classification obviates the need for 
unsystematic search for needed feature cards in a large
group. Holmstrom (22) and Kistermann and Uhlein (37) 
suggest edge-notching of feature cards for preliminary 
selection for searching cards in a large group. Donnay 
(16) separated his relatively small group of feature 
cards with buff tabs. The Synoptic system of tabs de- 
veloped and marketed by B. Lampel (41, 51) is another 
aid to selection of feature cards for searching and was 
advocated by Loosjes in conjunction with Uniterm (43, 
44, 45). Systematic notching to form a regular pattern 
when the cards are filed correctly was advocated by 
Wachtel (58). Hurlbut's statement that his cards can
be thrown back together in any arrangement after use 
is significant. This is a basic advantage of edge-notched 
cards. If the yet undeveloped suggestions of
photo-electric selection from feature cards (4, 25a, 27) find a
practical solution, the general application of Hurlbut's
statement will be invalidated. In any event, we need
comparative time studies of the use of feature cards and
of other systems comparable in simplicity, scope, and
function. 



6. W. E. Batten's Feature Cards

6. Batten's statement that no mechanical aid will 
make up for a defective classification system is borne 
out in his particular case by the detailed account of his 
classification of the patent literature of plastics (4) and 
the note in the KLD. Manual (22, 744.432 E 1). However,
this statement needs to be modified by further
investigation of Loosjes' proposals for the use of Uniterm
in connection with feature cards (43, 44, 45), by the 
theoretical work of the Taube group (55, V. 2 and 4), 
and possibly also by as yet unpublished ideas of Holm- 
strom (correspondence with writer) and Heinze (25a). 

Holmstrom (22) does not develop his suggestions
for the "facets" in Ranganathan's Colon Classification
in connection with feature cards, but they seem to be
worth further study.

Batten's doubts about the wider application of his 
system may have stemmed from concentration on his 
own problems. Literature which will be discussed later 
(e.g., 8, 25a, 45, 54, 55) will indicate far wider
applications than Batten envisioned.

Batten's statement that technical man-hours are
saved but not clerical man-hours is obvious. However, 
we need to examine the cost elements in each document- 
ation project to ascertain the economical limits of the 
expenditure of clerical man-hours in the use of feature 
cards. 

If the number of searchable features is unlimited 
(27), we need evidence to point out the economic lim- 
itations of the number of feature cards in a single ser- 
ies (cf. 55, V. 2). It is possible that there is no limit 
but it would be hazardous to plan such a file without 
some consideration of the facility with which it can be 
handled. 

Batten's passing reference to feature strips may
deserve further study. Would it prove to be an
improvement or an onerous complication to a mechanized
selection system? 



7. G. Cordonnier and Developments in France 

7. The 12,500-position Cordonnier card is said to
be large enough to allow the introduction of new
documents by leaving a certain number of positions empty
(22). This statement is made with reference to
collections not likely to exceed 12,500 items. The constant
accretion of documents is as serious a problem for
feature card systems in general as the expansion of
subject content is for edge-notched cards.

There is no statement in the literature about the 
actual steps needed to protect feature cards from expan- 
sion and contraction. The formula for Selecto plastic 
sheets and cellulose cards is not given. Other possible 
dangers, e.g., mold and insects in the tropical stations
of agencies such as the Institut des Fruits et Agrumes
Coloniaux, are not mentioned. This writer has seen a
shipment of 100,000 all-rag catalog cards reduced to
dust by the polillas of Martinique after a week in the 
custom house. 

The effectiveness of the various applications of 
Cordonnier's cards is mentioned (22), but no
comparative studies contrasting feature cards with other systems
of selection have actually been conducted. The Taube
group outlines the elements of such a study (55, V. 4). 

There must be some awkward peculiarity about the 
French judicial identification system to require the use 
of feature cards for fingerprint identification. A com- 
munication from the Identification Division of the Federal 
Bureau of Investigation in the writer's file advises that 
this agency has tried punched cards (but not feature 
cards) for fingerprint searching and found this method 
impractical. If feature cards are practical for finger- 
print searching, this application needs to be studied in 
detail. 

The Institut des Fruits et Agrumes Coloniaux (30)
believes that its colonial branches can depend on the 
feature cards for selecting references they need. No 
evidence other than the continued use of the system is 
available. Holmstrom (22) says it saves the expenses 
and delay of compiling indexes to several different bib- 
liographical organs. It would be well to know the 
names of these organs as a case study in what might 
be clumsy bibliographical organization. 

In connection with Holmstrom's list of apparent
objections (22), it would be well to document some of 
his refutations with comparative cost studies of different 
selection systems under the same conditions. 

The suggestion of edge-notching feature cards for 
selection from the file (22, 37) needs to be developed 
and described in detail. 

Holmstrom (22) says a complete search of any 
general subject can be made. He proves this statement 
by showing how an effective classification, indexed prop- 
erly by the arbitrary serial numbers of the cards, can 
theoretically provide all possible subjects to be searched. 



8. The Netherlands 

8. Westendorp (59) brings out the contrasts be- 
tween optical and machine searching (or simultaneous 
and sequential searching) of the same type of card and 
says that machine searching permits the search for sev- 
eral subjects simultaneously. It is apparent that this 
need rises from his specific problems (probably of com- 
putation and tabulation) at Philips, although he does not 
say so specifically. A broader comparison between the 
two systems in situations where both might be used 
would be useful. Westendorp also states that machine 
sorted cards wear out more rapidly than do feature 
cards. We need a study of the life expectancy of cards 
under both systems, and suitably durable material 
should be developed to avoid replacement. Cards
developed by Cordonnier (8, 50), the Office
of Basic Instrumentation (61), and Taube and his
colleagues (55, V. 4, "The Prototype Mechanical
Alpha-Matrex Machine") might usefully be tested for
durability under manual and mechanical conditions.

Loosjes (43, 44, 45) makes strong claims for the 
superior "readability" of Delta cards. Since he has al- 
so used Sphinxo cards, these claims are presumably 
based on his personal experience. Loosjes' claims
suggest the need of a study of the rapidity and ease 
with which different varieties of feature cards may be 
read. 

In only one reference (45) does Loosjes expand on 
his advocacy (43, 44, 45) of the use of Uniterm and the 
Synoptic filing system in connection with the use of fea- 
ture cards. He provides no more evidence for these 
contentions than what is quoted in this text; and his 
recommendations for the use of Uniterm must be sup- 
plemented by reference to the work of Taube and his 
associates (55, V. 2, "The Mechanization of Coordinate 
Indexing," and V. 4, "The Prototype Mechanical
Alpha-Matrex Machine").

If feature cards record the dynamics of scientific 
research more effectively than other systems (43, 45), 
we need to know specific types of research of which the 
dynamics need to be expressed. The enthusiasts for 
feature cards see the virtues of this system as applied 
to their own problems, but the comparable utility of fea- 
ture cards, edge-notched cards, linked hole cards, and 
machine selected cards must be closely defined. 



10. England 

10. The claims made in the Carter-Parratt pro- 
motional folder (7a) are supported only in part by ev- 
idence. Holmstrom's communication with the writer 
dealing with his invention may make the claim for high 
speed of operation true in all cases, but it is not true 
at present except for systems involving only one or two 
series of feature cards. Elsewhere (22) Holmstrom 
says several different desks are needed to operate some 
systems with multiple series. It is open to question 
whether the speed thus obtained by human labor is more
economical than expensive mechanical equipment when 
we depart from Batten's contention (3, 4) that feature 
cards are for the small operator. There is a tempting 
(and possibly misleading) analogy here between the ar- 
guments between devotees of the abacus and the high- 
speed digital computer. 

If there is a constant control of data, it is at the 
cost of time needed to refile cards (cf. Holmstrom's
invention, of which we have no details as yet). 

The simplicity and economy of feature cards have 
been brought out by all users. Economy of large-scale
systems involving two or more series of cards is
subject to investigation.

Compactness is a virtue until the point at which 
features may proliferate rapidly. Conceivably a few 
thousand items might have to be indexed under a dis- 
proportionately large number of features. 

Feature cards are versatile for the applications 
recommended by Jolley and Carter-Parratt (7a, 33, 34,
35, 36). Application to a complex body of scholarly 
literature might cancel this claim in some instances. 
It might be well to attempt to apply feature cards to 
some bodies of literature in which their utility will be 
much less than in fields where they have been used 
thus far. A few negative results would help to define 
the limits of their practical utility. 

Jolley (33) offers an extreme case of the compar- 
ison of feature cards and item cards. In his applica- 
tion to personnel records (34), he is dealing with a 
known number of employees (items) in a given agency, 
and features needed to describe personnel for
management purposes are not excessively numerous (34).
Universal application of this argument to all possible uses
of feature cards is subject to investigation in each
instance. Jolley's suggestion of the use of feature
cards for correlations (34) is significant. He says (34): 
"In research applications this means that no line of
inquiry need be followed up unless there is already
evidence that a correlation of one sort or the other exists
between the relevant features."



11. The United States 

11. Wachtel's argument (57, 58) for her feature 
cards over edge-notched cards is supported in part by 
the simple arithmetic that the latter require twice as 
many cards in a pack as the former. She does not 
prove convincingly that her edge-notching device for re- 
filing is a substitute for the advantage of edge-notched 
cards that they can be thrown back together at random 
(cf. Hurlbut, 28). 

"Unlimited expansion" of the Atomic Energy Com- 
mission cards needs definition. It is unlikely that any 
substantial number of new properties will be identified. 
Here, as in the case of the mineralogists' cards, we 
have a relatively stable number of features and items; 
and therefore claims made for these two systems are 
not always universally applicable. 

The level of retrieval has been set by the operat- 
ors of the Office of Basic Instrumentation system as the 
provision of abstracts. It is possible that this is satis- 
factory to the laboratory worker in this field, but evi- 
dence is lacking. If microcite and its potentials are 
fully developed, the desirable level of retrieval for each 
case needs to be identified. 

Substantially larger surfaces than 3" x 5" cards
have been legibly reproduced on film on areas even
less than 0.025" in diameter. This possibility needs
further study with relation to its applicability to the 
microcite principle. Possible relevance of the minicard 
system also seems to justify study. 

Wildhack et al. (61) argue for the versatility of
feature cards. As examples, they cite the possibility 
of identifying some less important or less obvious fea- 
ture of a question and the statistical analysis of the 
literature. Together with evidence cited by Jolley (33), 
there is good reason to accept the idea of the
versatility of feature cards. However, the various aspects of
this versatility should be identified, and the
applicability of feature cards to individual classes of situations
should be made clear. 

The practical value of the dropping fraction tables 
(55, V. 4) is illustrated by four examples. No large 
scale operations are analyzed, but there is no reason 
to assume that the hypothetical examples are not typical. 

Just as in the case of the special cards made up 
by Cordonnier (8, 50) and the Office of Basic
Instrumentation (61), it would be desirable to know the
precise composition of the cards. Taube's group says that
"fabrication of the cards turned out to be the most
critical aspect of the entire project."

The claim for the speed of search by the Alpha- 
Matrex machine is supported by actual time studies, 
with tabulations. Comparative studies are not made, 
but an outline for such a comparative study is presented. 
The relative expense of the machine is as self-evident 
as is the relative cost of manual feature card sets and 
machine selection devices. As for the freedom from
mechanical failure, the inventors suggest that the
Alpha-Matrex may give a deceptive picture of the facts of
machine life and may become more typical as it becomes
more complex. As for the intermediary character of the
Alpha-Matrex, the work on microcite and minicards suggests
the need and feasibility of serious investigations along
these lines.

Jonker Business Machines, Inc., 404 North Frederick
Avenue, Gaithersburg, Maryland, headed by Frederick
Jonker, a former Taube associate, has been working
on practical developments of Alpha-Matrex. Mr.
Jonker has developed and marketed a "Termatrex"
machine to handle cards with a basic capacity of 10,000
items of data. He is now working on a system that
"will search millions in a matter of minutes," completely
automatic push-button equipment, and equipment
tie-in with computers and IBM systems. He is fully aware
of the need for direct access to information rather than
using the Termatrex machine as an intermediate device,
and his firm is working on solutions for these problems.

12. The Germanies 

12. Knappe (40) demonstrates on graphs the re- 
sults of actual time studies made to prove his conten- 
tion that time required for searching a feature card sys- 
tem does not increase in proportion to the increase in 
the number of documents. 

Derbolowsky (12) says "Die Hollerith-Methode ist
auf medizinischem Gebiet die Methode der Korrelations-
und Grosszahlenforschung" (in the medical field the
Hollerith method is the method of correlation and
large-number research), without further explanation.
Jolley (33) has shown that certain types of correlations 
may be readily identified by feature cards, e.g.,
clusters of perforations at one point on a card when
documents are recorded chronologically, or a significantly
large number of through holes when two cards are
superimposed. This is a useful example of the type of
information that needs to be developed and critically
examined in a comparative study of various systems of
punched cards which are put to the same tasks. 
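
The superimposition of two feature cards amounts to
intersecting two sets of document numbers. The sketch
below is illustrative only: the function names are
hypothetical, and the "expected" figure rests on the
added assumption that the two features are independent,
which is not a test proposed by Jolley.

```python
def superimpose(*cards):
    """Return the positions punched on every card (where light passes through)."""
    if not cards:
        return set()
    result = set(cards[0])
    for card in cards[1:]:
        result &= set(card)
    return result


def through_holes_vs_chance(card_a, card_b, total_documents):
    """Compare observed through holes with the count expected by chance.

    Assumption: under independence, the expected number of coincident
    holes is len(a) * len(b) / total_documents.
    """
    observed = len(superimpose(card_a, card_b))
    expected = len(card_a) * len(card_b) / total_documents
    return observed, expected


# Each feature card is the set of document numbers punched on it.
feature_a = {1, 2, 3, 4, 5, 6}
feature_b = {2, 3, 4, 5, 9}
print(superimpose(feature_a, feature_b))
print(through_holes_vs_chance(feature_a, feature_b, total_documents=20))
```

A marked excess of observed over expected through holes
is the kind of signal that would justify following up a
line of inquiry; whether the excess is significant is a
separate statistical question.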

Heinze (25a) gives no indication of the character 
of his proposed photoelectric device for selecting cards 
for examination. Here is a potentially significant
contribution. The comparative efficiency of this device, of
Holmstrom's yet unrevealed notion for simplifying the
searching of multiple sets, and of the Alpha-Matrex
will need to be subjected to detailed comparison in terms 
of expense and effective retrieval of information. 



Major Targets for Research in Feature Cards 

The disarming simplicity of the basic feature card 
principle tempts us to overlook broad elements of their 
efficiency (cost of equipment and operation and effective- 
ness of information retrieval). Only the Taube group 
(55, V. 4) and Heinze (27a) have given proper attention 
to this element. Detailed time and cost studies, start- 
ing with the outlines of the Taube group, must be made 
in a large number of characteristic situations before we 
can accept all the claims made for their efficiency (as 
defined above). In particular we need comparative stud- 
ies with other sequential and instantaneous information 
retrieval devices, both manual and mechanical, and with 
the various types of library catalogs (author, subject, 
dictionary, classed, card, and printed). 

Some of the enthusiasts for feature cards have been 
a bit reckless in their claims for the versatility and 
wide application of the system. The limits of utility of 
feature cards, both for subjects and for methods of in- 
vestigation, need to be closely defined. This may best 
be achieved in connection with studies proposed in the 
preceding paragraph. 

The mechanization of feature cards, actually a- 
chieved by the Taube group and seriously proposed by 
Batten, Holmstrom, and Heinze, is the next major step 
forward. At the same time the search problem must 
be approached in the same imaginative terms as those 
applied by the Taube group and Heinze. Symbolic logic 
as used by the Taube group will carry us only so far. 
Rigid tests of feasibility are indispensable at each stage 
of theoretical or mechanical development. 

The level of retrieval of information is a grave 
problem in all systems. Rider attempted to make it 
absolute by inserting microcards with full texts in library 
catalogs in lieu of author cards. Where little
information is needed (e.g., the cards on properties of nuclides),
such a solution is easy. We need to know how
accessible full or partial information must be to workers in
various fields, how economical it is to provide it, and 
in what form it should be given in different fields. To 
what extent is microcite the answer in a feature card 
system? Can minicards be adapted to a feature card 
system? What photographic reductions are feasible in 
such cases? 

How practical is it to "publish" feature card packs 
such as those on properties of nuclides or those of the 
Institut des Fruits et Agrumes Coloniaux? To what
situations can such "publication" be applied?

Apparently the Taube group, Cordonnier, and the 
Office of Basic Instrumentation have overcome the 
problems of producing cards to hold pinpoint perfora- 
tions accurately and, in the first case, to stand up un- 
der machine handling. This problem will remain with 
us if we move toward cards with extremely large num- 
bers of positions and more machine handling. It is a 
basic practical problem that will require constant at- 
tention. 


Volume Four Part Three
PUNCHED CARDS 

by 

Ralph Blasingame, Jr. 



I. Introduction 

The history of the application of machine-sorted
punched cards to libraries is a relatively brief one,
perhaps twenty-five years at most. During that time, 
applications have been made in three main areas: first, 
in the routine, repetitive tasks concerned with circula- 
tion, acquisitions, accounting and the like; second, in 
the somewhat broader and possibly more significant area 
of bibliographic control; and third, in preparing copy for 
standard library tools (lists and catalogs). In the first 
type of application, standard machines have been used. 
In the field of bibliographic control, on the other hand, 
a considerable amount of effort has been expended on 
the development of mechanical punched-card and related 
equipment particularly applicable to the requirements of 
the problems encountered. From these two facts relat- 
ing to equipment development, one may infer that ma- 
chine-sorted card equipment is not especially profitable 
in the first type of application (though perhaps the only 
profit consideration has been on the part of the equip- 
ment manufacturers) and that the second type of applica- 
tion has attracted people of greater aggressiveness and 
with more pressing problems. Both inferences are 
probably correct, and it is interesting to note that the 
use of punched-card equipment in the first type has, 
judging from frequency of mention in the literature, 
reached a standstill, while the punched-card seems to 
be headed for an incidental role, perhaps as a program- 
ming device, in the second type. The third type appears 
to have passed the experimental stage, but has not been 
widely accepted as compared to the first or second. 

The customary aim of the first type of application 
is to free the librarian to work in more important areas, 
normally in public services. For the second, the gen- 
eral justification is to give a measure of control over 
the literature. For the third, to make information on 
library holdings known over a wide area is the common
aim. 

Information for this study was gathered through 
searches in Library Literature, Library Science Ab- 
stracts and the "Annotated Bibliography on Uses of 
Punched Cards" in Punched Cards, Their Applications 
to Science and Industry (1958), second edition (1). This 
study is not comprehensive in the sense that every ar- 
ticle located has been mentioned. An effort has been 
made to sort out those items of the literature of the 
application of machine-sorted cards to libraries which
seem to typify applications in the areas discussed. For
convenience, IBM and Remington-Rand cards and related
equipment will be lumped together hereafter
under the term machine-sorted cards.



II. Routines

As a substitute for manual filing, sorting and
record-keeping, machine-sorted card systems date back at
least as far as 1936 (2). Soon after that, some attention
was given to machine-sorted cards as a device for
recording library statistics (3). Growth of the application
of these machines to library routines was slow, perhaps 
because of the depression budgets on which many libra- 
ries existed or perhaps because librarians have general- 
ly been slow to adopt machine methods. By 1943, how- 
ever, at least six applications were known (4); by 1946, 
at least eight were recorded (5); and by 1950, seventeen 
such applications were located (6). Though by far the
greatest activity in this general area has been in Amer- 
ican libraries, some work has taken place elsewhere, 
especially in England (7). 

A review of the literature of applications of ma- 
chine-sorted cards to library routines is an especially 
unrewarding experience. With a few exceptions, the 
articles one encounters are speculative, painfully
uncritical, repetitive, or some combination of the three.
This is not necessarily a criticism of librarians; the 
seeming magic of punched card machinery has enchanted 
a good many supposedly hard-headed business men. An 
indication of general use of the machines lies in the
phenomenal increase in the market price of IBM
Corporation stock. Perhaps another indication is the fact
that only three articles were found after protracted
search which reported that a machine-sorted card
application had been discontinued. This latter fact may,
of course, also show that converting to a machine-sorted
method may be quite the same as taking a bear by
the tail. 

Interest in applying machine-sorted cards to library
routines, judging from the number of articles located
for this study, has flagged since the early 1950's. Of
the forty-five references examined, only three date
from 1955 or later, while twelve appeared between 1950
and 1955 and twelve between 1944 and 1949. Further,
those published since 1955 do not describe new
applications.

Several reviews of machine-sorted card
applications have appeared, none of which furnishes a complete
coverage, probably because the literature is scattered
through state library association bulletins, master's
essays, and pamphlets, in addition to the more or less
standard library journals. Perhaps the most complete 
and up-to-date review has been made by Berry (8). An- 
other of note for its selection of unusual applications 
rather than completeness is by Gull (9). Parker's 
descriptive book on punched-card application is required 
reading for anyone wishing a review of procedures (10). 

The following routines are among those to which 
machine-sorted cards have been applied in some way:

Circulation
Acquisitions and Accounting
Analysis of Book Stock
Serials Acquisition and Control
Preparation of Catalogs
Shelf-listing
Payroll and Personnel Records
Billing
Inventories of Equipment and Materials

Circulation

Perhaps because circulation routines present the 
same relatively simple problems time after time, a 
comparatively large number of libraries have applied 
machine-sorted cards to them. The variety of methods 
used may reveal both the flexibility of use of the ma- 
chines and the ingenuity of librarians. Methods here 
may be classified in three general types: first, those 
which make a mechanical record from book and borrow- 
er cards; second, those in which the punched card is 
used as a call slip and is punched for date due (and 
sometimes for other information); and third, those in 
which the card carries only a serial number and which
are more or less straightforward transaction number
systems.

Of the first class, the system installed at the 
Montclair (New Jersey) Public Library is certainly the 
most publicized (11). This system originally involved 
a machine which has not been made generally available. 
However, its principal advantages may be achieved
by the use of standard machines through a method
suggested by Callander (12). The comparative expense,
however, has not been determined or, at least, has not 
been made public. The immediate drawback to the use 
of this system is that it requires that a card be punched 
for each cataloged item in the collection and for each 
borrower. Selected information from the borrower's
card and the book card is punched automatically into
a third card (remotely, if desired). Upon return of the
material, a fourth card is punched and the "out" and
"in" cards then constitute the record of each loan.

A description of the second type of circulation sys- 
tem may be found for the University of Florida (13). 
Related systems are in use in the libraries of the
University of Missouri, the University of Wisconsin, and
elsewhere. By and large, these systems represent an
effort to incorporate into one card file both information
as to the location of materials charged out and due-date
records. A third kind of record can be included in the
single file; namely, a record of the identity of the
borrower. A variation of this type of system is used by
the Library of Congress in locating talking book
machines (14).

Variations on the third class of machine-sorted
card circulation systems may be found in the Brooklyn 
College Library (15), the Detroit Public Library (16) 
and the Stockton (California) Public Library (17). The 
principle involved is the same, but in the first case 
the primary record is a call slip, maintained in numer- 
ical sequence and in the others it is a filmed record of 
the transaction. In both types, one of the most import- 
ant advantages is that the transaction cards retrieved 
from returned books may be sorted mechanically into 
sequence and missing numbers may then be located by 
matching the returned cards with a perfect deck. 
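
The matching step just described can be sketched in a few lines. This is only an illustration, not a description of any of the cited installations: the function name and the use of plain integers for transaction numbers are assumptions made for the example.

```python
def missing_transactions(perfect_deck, returned_cards):
    """Return the transaction numbers in the perfect deck that have
    no counterpart among the returned cards.

    perfect_deck   -- every transaction number issued, in ascending order
    returned_cards -- numbers on cards retrieved from returned books,
                      in any order (hypothetical representation)
    """
    returned_sorted = sorted(returned_cards)  # the mechanical sort
    missing = []
    i = 0
    for number in perfect_deck:
        # Pass over any returned cards that fall below the current number.
        while i < len(returned_sorted) and returned_sorted[i] < number:
            i += 1
        if i < len(returned_sorted) and returned_sorted[i] == number:
            i += 1  # matched: this loan has been returned
        else:
            missing.append(number)  # no returned card: still outstanding
    return missing


outstanding = missing_transactions(range(1, 11), [7, 2, 9, 1, 4, 10, 3])
print(outstanding)
```

The numbers reported as missing correspond to loans still outstanding, which would then be looked up in the call-slip file or the filmed record.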

The advantages sought by the application of
punched cards to circulation are speed, accuracy,
lowered costs and rapid turnover of stock. All of these
gains may have been realized in some libraries;
seriously lacking from the literature, however, is evidence
that punched-card methods are more efficient than (or
even as efficient as) carefully planned manual,
photographic or sound-recording methods.

Acquisitions and Accounting 

Applications of machine-sorted cards to acquisitions
and accounting routines represent, for the most
part, nothing distinct from ordinary business
procedures adapted to those physical things peculiar to
libraries. A case study has been made of the procedures
used in the Columbia University Libraries for
accounting (18), and somewhat briefer descriptions have
been written of systems in the Library
of Congress (19), the Milwaukee Public Library (20)
and others. In England, an allied application is in the
maintenance of the stock record (21).

At least in theory, the punched card used as an 
order or accounting device could become the shelf-list 
card and/or might be used to select lists of materials 
by subject. A primary advantage of the card is that it 
is reusable; that is, its primary cost may be divided by
the number of times it is used. Unfortunately, no good
description of the actual use of order cards in this
manner could be found. Perhaps the cost advantage is more
apparent than real when one begins to add the necessary 
information (classification number, for example) to the 
individual card. 

There are other theoretical advantages to the use
of machine-sorted cards in acquisitions routines. For
example, sorting an outstanding order file for materials
ordered but not received can be a time-consuming and
often frustrating piece of work. Such a job could be
accomplished easily and automatically if the order file
were on machine-sorted cards and if each included the
date of ordering. This and other uses are suggested by
Parker (22). Exploitation of these possibilities takes 
careful planning and a knowledge of the machines which 
few librarians have. Furthermore, where machine ac- 
counting is imposed on the library by a higher authority 
as a step in centralization of work (for example, ac- 
counting), considerations of the central office may pre- 
clude design of the cards solely for library purposes. 
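
The outstanding-order search mentioned above amounts to selecting on two fields of each card. A minimal sketch follows, assuming each order card is represented as a dictionary with hypothetical "title", "ordered" and "received" fields; none of these names comes from Parker or from any system cited here.

```python
from datetime import date


def overdue_orders(order_cards, today, max_days=90):
    """Select cards for items ordered but not received within max_days.

    order_cards -- list of dicts with hypothetical keys:
                   "title", "ordered" (date), "received" (date or None)
    """
    selected = [
        card for card in order_cards
        if card["received"] is None
        and (today - card["ordered"]).days > max_days
    ]
    # Oldest orders first, as a claiming list would normally be read.
    return sorted(selected, key=lambda card: card["ordered"])


cards = [
    {"title": "A", "ordered": date(1958, 1, 10), "received": None},
    {"title": "B", "ordered": date(1958, 3, 1), "received": date(1958, 4, 2)},
    {"title": "C", "ordered": date(1958, 5, 20), "received": None},
]
print([c["title"] for c in overdue_orders(cards, date(1958, 6, 1))])
```

The selection works only if the date of ordering was recorded on every card, which is the condition the text places on the method.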

Analysis of Book Stock and its Use 

At least in theory, a device which will enable the 
librarian to analyze his collection in terms of subject 
coverage, age of materials, use of materials and other 
related factors should be of great benefit. Maintaining
a "balanced" collection, replacing or removing outdated
items and spotting areas of heavy or light use are
important tasks which are at best sporadically accomplished
in many libraries. Because of the publicity given to
analysis of book stock and use of materials at the
Montclair Public Library (23), the impression has gotten
abroad that machine-sorted cards present an easy way of
accomplishing these tasks.

From conversations with Miss Quigley, this writer 
believes she would be the first to hold up two cautions: 
first, no information can be retrieved from cards unless 
it was recorded on them; and second, planning the
system at Montclair was an involved process and at least
a measurable amount of time and money has gone into
recording data which have seldom, if ever, been used.
However, at least one study of use of a collection using
a machine-sorted card method has been carried out
independently of the actual circulation system in use (24).
Thus, the data necessary to studies of use may be
collected either in the process of circulation or by special
study while retaining the presumed advantages of the
machine-sorted card. Parker (25) and Wight (26) early
suggested the use of these cards in circulation studies
and in the recording and analysis of library statistics
generally. Parker (27) has further suggested that
machine-sorted cards may be a useful device in making
studies of obsolescence of materials, an area of
research in which work is made difficult by the mass of
data needed for adequate generalization.

Waugh (28) presented some statistics which illustrate
the kinds of information as to who reads what
which may be obtained from the Montclair application. 
At the same time, Waugh indicated, perhaps uncon- 
sciously, a danger of that type of analysis if the analyz- 
er is not aware of other important factors; namely, 
there is the possibility that the selection of materials 
might be influenced too much by observed use. That is, 
the poorly served people in a community might become 
even worse off than before if selection followed use. 

Serials Acquisition and Control 

The multiplicity of tasks to be performed for ser- 
ials holdings in reviewing subscription lists, placing of 
orders, recording of receipts, and charging to various 
funds will make even the most tradition-bound humanist 
look for mechanical help. Many types of equipment, of 
course, have been applied to traditional methods and 
some departures in policy have influenced procedures. 

Moffit (29) has written perhaps the most complete 
article descriptive of one system of using machine- 
sorted cards for financial control of serials acquisitions. 
He presented these advantages as a result: increased 
control of prices and lists; simplified review of titles 
subscribed to by subject; and the feasibility of a local 
union list of serials. The Milwaukee Public Library
(30) has also used machine-sorted cards in connection
with its serials records. No description of the
last-named system could be located. Parker (31) has
suggested that financial records, records of receipt of
issues and binding schedules may be maintained for
serials on machine-sorted cards.

Preparation of Catalogs 

A close (and quite probably arguable) distinction
is observed here between the use of machine-sorted
cards in the routine of catalog preparation and
maintenance and their use in literature indexing and in
the manufacture of lists and catalogs for wide
distribution. The latter two types of
application will be discussed later.

One might suspect that the catalog of the library
of the Department of Education of the International
Business Machines Corporation (Endicott, New York) should
have been placed on machine-sorted cards. It has (32).
There, a basic card was prepared for each item and
three other cards were reproduced, forming four files.
No further description of this application was located.
Taube (33) has mentioned very briefly the use of
machine-sorted cards to prepare a subject-authority file
and to convert that list to book form.

In this area is found a great rarity: reference to
an attempt to apply machine-sorted cards which was not
continued. Challons, in a discussion following a paper
by Perry and others (34), mentioned briefly an effort to
prepare a catalog of some 2,000 textbooks in the
Technical Library of the Admiralty Signal Establishment.
An effort was made to prepare Duplimat stencils
directly from machine-sorted cards using a tabulating machine.
For reasons not specified, the operation was termed
technically not feasible. Machine-sorted cards have
been used to make somewhat unconventional catalogs for
maps and picture files (35). The Army Map Service
produced an inventory record for its map collection.
Such data as area, scale and language were recorded on
machine-sorted cards. The resulting file then could be
used as a catalog, selection device or charge record.
Langan's "Film 'n File" system involved the insertion
of microreproductions of pictures in machine-sorted
cards and the coding of the subject in the balance of
the card. Parker (36) suggested the possibility of
using the cards in catalog production as early as 1938.
Such applications, at least to produce conventional card
catalogs, however, either are not feasible or, at least,
the actual effort has seldom been made. However, Pike
(37) mentioned briefly the transfer of information from
machine-sorted cards to catalog cards in a library in
England. As will be discussed later, application of
punched cards to production of copy for lists and
catalogs in book form is feasible.

Shelf-Listing 

Any circulation system which requires the punch- 
ing of a card to represent each item in the collection 
provides the basic record from which a shelf list could 
be made. Specific mention, however, of this type of 
application is infrequent. The Milwaukee Public Library
(38) has done its shelf-listing mechanically and Jones
(39) has briefly described a system in operation in
Stockport, England for maintaining accession records
(presumably this could be a shelf list) on Powers Four
Cards. The Stockport method does not furnish a
complete record of holdings, though it could be modified to
do so. A hybrid application, which combines accounting,
cataloging, and inventorying, is the project carried out in
the Library of Congress for listing and accounting for
surplus books for veterans at the end of World War II
(40). A complete shelf list on punched cards could be
used in several ways: to help determine the value of 
the book stock for insurance purposes; to provide lists 
of new accessions; and to assist in analyzing the book 
collection. 

Payroll and Personnel Records 

Only one reference to the application of
machine-sorted cards to personnel records and payrolls was
located in the search made for this study (41). However,
in all likelihood many institutions having centralized
personnel records have applied such methods, and
Parker (42) has outlined several possible methods. This
may be considered a "normal" use of the cards which
may be found in many businesses and will not be
studied further for present purposes.

Similarly, billing and inventories of equipment are 
seldom mentioned in library literature and applications 
of machine-sorted cards here probably offer less of val- 
ue than in business and industry. 



II. B Literature Searching 

There is a growing mountain of research. But 
there is increased evidence that we are being 
bogged down today as specialization extends. The 
investigator is staggered by the findings and con- 
clusions of thousands of other workers - conclu- 
sions which he cannot find time to grasp, much 
less to remember, as they appear. . . . 

Professionally, our methods of transmitting and 
reviewing the results of research are generations 
old and by now are totally inadequate for their 
purpose (43). 

This basic problem expressed by Bush has led 
many people to speculate on and to experiment with dif- 
ferent mechanical devices to the end of indexing litera- 
ture so that it might be searched completely and quickly. 
By the time serious speculation on the problem began,
machine-sorted card equipment was being used in other
ways. Because of its seeming flexibility and speed,
such equipment must have appeared to provide a
ready-made solution to the problem.

Machine-sorted cards have attracted attention from
both admirers and detractors. On the one hand (44), 
their advantages are described: 

1. No necessity to establish a methodical co- 
herent system of arrangement. It is sufficient if 
individual characteristics are identified with con- 
secutive code numbers according to an alphabetical 
index or other chart. This means that the work
of coding can therefore be completely adapted to
the particular task.

2. No necessity to keep the punched cards in 
any particular order in the file. The cards which 
are picked out can be put back anywhere in the 
file. 

3. Rapid sorting, even when there are very 
large numbers of cards. 

Fairthorne (45), on the other hand, states that
"The disadvantages of punched cards are consequences
of their needing but insensitive and coarse means of
discrimination, and of being permanent. The power and
general cost of handling a large volume of information
at even moderate speeds can be very large, and
increases as the cube of the speed. . . ." Further, "The
arithmetic interpretations of punched-hole positions and
subsequent computational operations are not only
inconvenient, but also unnecessary, in library work."

At least as early as 1936, the possibility of the
application of machine-sorted cards to literature
control had occurred to Mayor (46). Since that time, a
considerable amount of actual experimentation has been 
carried out and a good deal of constructive speculation 
has been published. The literature is scattered through 
library periodicals, scientific journals, house organs, 
books, pamphlets and the increasing number of docu- 
mentation journals. The effort necessary to pull even 
a fair portion of it together is enough to reinforce the 
view that mechanical control of literature is highly de- 
sirable. 

Mechanical literature searching is not universally
accepted as the ideal method of preparing for an
investigation. Burchard (47), while demonstrating the vast
problem which faces the scientist in a literature search,
makes the point that simplifying the searching process
may result in the loss of "accidental stimuli," which
have sometimes been most productive of ideas.

In relation to scope of coverage, at least two
major types of machine-sorted card applications to
literature control have gone on simultaneously. On the one
hand is the mechanized documentation center, and on the
other is the relatively narrow and restricted application.
As to methods of using machine-sorted cards, at least
three major variations have been attempted. In the
first case, the effort has been to record all information
pertaining to each piece of literature on one card,
using trailer cards where necessary; in the second type,
the "unit-card" concept has been developed in which
each article, report or item of interest is identified by,
to use an actual example, raw material, process,
product and properties. Four cards are then prepared for
each bibliographic item or, if necessary, part of a
bibliographic item, and maintained in separate files. In
the third type, experimental data are recorded directly 
on cards. This last, though it can be used as an index 
to scientific investigations, is more a laboratory than 
library application and will not be treated in this study. 
Not all restricted applications will fit into one of these 
three categories. For example, Stoetzer has described 
a system in which machine-sorted cards are "keyed" to 
an existing classed catalog (48). However, the three 
general methods mentioned appear to cover most sys- 
tems of particular significance. 

Luhn (49) has presented a system for sorting
machine-sorted cards photoelectrically. In brief, the
scanning machine passes the cards lengthwise under a
"question card." Light is directed through the question
card into a bank of photocells. When a card bearing
a pattern of holes complementary to that in the question
card passes beneath the question card, light is shut off
to the photocells, and that card is selected. This
machine could, of course, be applied to either a
documentation center or a limited field. However, no record of
its actual application was located.
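
The selection principle can be illustrated with a deliberately simplified model. It is assumed here, for illustration only, that each card is a set of punched positions and that light reaches a photocell wherever a hole in the question card lies over a hole in the passing card; the position numbers and card labels are invented, and nothing below is taken from Luhn's own description of the machine.

```python
def light_reaches_cells(question_holes, card_holes):
    """True if at least one position is open in both cards."""
    return bool(question_holes & card_holes)


def select_cards(question_holes, deck):
    """Return labels of cards that shut off all light to the photocells.

    deck -- list of (label, set_of_punched_positions) pairs (hypothetical)
    """
    return [
        label for label, holes in deck
        if not light_reaches_cells(question_holes, holes)
    ]


question = frozenset({0, 3, 5})  # holes punched in the question card
deck = [
    ("A", frozenset({1, 2, 4, 6, 7})),  # exact complement on 8 positions
    ("B", frozenset({0, 1, 2})),        # shares position 0 with the question
]
print(select_cards(question, deck))
```

Under this model a card with very few holes would also block the light and be selected, so a working code would presumably need some constraint on the number of holes per card; the source does not say how the actual machine dealt with this.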

Documentation Center 

Twenty years ago, Delmas (50) suggested the use 
of machine-sorted cards in a central location. His 
idea, in brief, was that requests for information could 
be directed to the center where partial catalogs or
selected bibliographies would be made using mechanical
methods. Other similar suggestions have been made,
including the sponsorship of such a documentation center
by a library association (51). The underlying bases for
the documentation center are the inadequacy of
traditional methods of bibliographic control and the potential
benefits of concentrating time and money.

A documentation service using machine-sorted 
cards was reported (52) in 1946. In this application, 
the cards are used as a subject index, are filed by 
main subjects and carry bibliographic data on the re- 
verse side. 

Pietsch, who has been concerned with operating a 
documentation center using machine-sorted cards, has
written an "Evaluation of mechanized documentation" 
(53). He describes the problems faced by the Gmelin 
Institute in publishing the Gmelin handbook and the ap- 
plication of IBM cards and equipment to create a litera- 
ture searching service. Pietsch concludes that "The 
present state of electronic development of automatic doc- 
umentation can be expected to move in the direction of 
electronic storage units" (54). Thus, he feels that ma- 
chine-sorted cards are not the solution, or will be used 
incidentally rather than as a primary storage device. 

Machine-sorted cards, then, once appeared to pro- 
vide a solution to literature searching on a broad scale 
and have been used in a working system in a
documentation center. Their speed in standard equipment,
however, appears either to be lower than was supposed or
to be less than is essential for searching large files.
Shaw (55) has discussed the sorting speed, storage
capacity and related problems of machine-sorted cards and
other devices which may be used for bibliographic
searching. While he states no definite conclusions in
reference to machine-sorted cards, it is clear that the
storage capacity of such cards is substantially less than
that of the other devices which he examined (56). He also
points out that the effective speed of a machine-sorted
card system is substantially lower than the speed of the
individual machines may indicate (57).

Restricted Applications

Applications of machine-sorted cards to specific
files drastically restricted in size and scope as
compared with the documentation center concept are
numerous. Thorough investigation beyond the literature
might reveal a sizeable number not yet reported and
might yield new notions in methods. Evidence
that these smaller files are susceptible with some
degree of success to indexing and searching by
machine-sorted cards, and that a variety of methods may be
used, is to be found in the publication resulting from
the Symposium on Systems for Information Retrieval,
Western Reserve University, Cleveland, Ohio in 1957
(58). Eight of those systems will later be described
briefly as more or less typical of other similar
applications.

One account of a limited application which was 
later abandoned was found. Ashthorpe (59), reviewing 
an attempt to cope with rapidly accumulating report lit- 
erature at the Atomic Energy Research Establishment, 
Harwell, England, reported the following disadvantages 
of a machine-sorted card system as reasons for
reverting to an orthodox card index:

1. Especially where multiple sorts were re- 
quired, the machines were too slow. 

2. Machine sorting did not eliminate enough
cards. 

3. Searching on a "fine" UDC number sometimes
meant re-sorting on a broader number.

4. Wear on cards impaired their usefulness 
and caused delays. 

5. Use of the system was such that the ma- 
chines stood idle much of the time. 

6. No more than one search could be carried 
on at a given time. 

Ashthorpe described the advantages of an orthodox 
card index over the machine system as: 

1. Several searches can be made simultaneously.

2. "Snap" questions are more easily answered.

3. The searcher can adopt fresh approaches 
during the search. 

4. Wear and tear on the file is less - or at 
least less critical. 

It is possible that at least some of the problems 
Ashthorpe encountered are a result of using the Uni- 
versal Decimal Classification. 

In each of the cases reported in Information Sys- 
tems in Documentation, an indexing or classification 
system was designed or adapted for the specific field to 
be covered. However, coding is not a subject to be 
discussed in this review and only the methods of using 
machine-sorted cards will be discussed below.

Single Card-Multi Field Method 

At first sight, machines actuated by punched cards 
appear to operate with great speed. The speed quoted, 
for example, for the IBM Sorter is 21,000 cards per
hour. High-speed sorters reach approximately 60,000
cards per hour. If only one column must be sorted, 
and if nothing remains to be done after sorting, those 
speeds are correct. However, literature indexing sys- 
tems typically require more than a single column; in- 
deed, they frequently require that several multi-column 
fields be sorted. In addition, cards must be brought to 
the sorter and taken away after the sorting and cards 
sorted out of a file must be refiled. Thus, the effect- 
ive speed of conventional punched-card machinery must 
be calculated separately for each application and will be 
substantially below the advertised speeds of some of the 
machines. 
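
The gap between rated and effective speed can be made concrete with a rough calculation. The sketch below assumes one pass of the whole deck through the sorter for each column sorted, and lumps carrying, stacking and refiling into a single handling allowance; the handling figure and the six-column field in the example are illustrative assumptions, not measurements from any installation.

```python
def effective_cards_per_hour(rated_per_hour, n_cards, columns, handling_hours):
    """Estimate throughput for a multi-column sort.

    rated_per_hour -- advertised single-column speed of the sorter
    n_cards        -- number of cards in the file being sorted
    columns        -- total columns to be sorted (one pass assumed per column)
    handling_hours -- assumed time spent moving and refiling cards
    """
    machine_hours = columns * n_cards / rated_per_hour
    return n_cards / (machine_hours + handling_hours)


# A 21,000-card file, a six-column field, one hour of handling (assumed).
print(effective_cards_per_hour(21_000, 21_000, columns=6, handling_hours=1.0))

# The advertised figure is recovered only for one column and no handling.
print(effective_cards_per_hour(21_000, 21_000, columns=1, handling_hours=0.0))
```

On these assumptions the six-column job runs at one-seventh of the rated speed, which is the sense in which effective speed must be calculated separately for each application.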

One type of solution to this problem of speed in 
sorting is represented by a group-selecting device (60). 
At this stage in the development of mechanical docu- 
mentation, such devices, while useful, are rudimentary. 
A more flexible, though undoubtedly more expensive, 
tool is the machine which will identify a code designation 
and select the appropriate cards no matter in what part 
of the cards the code is punched. Garfield (61), in
describing the IBM 101 Statistical Punched Card Machine,
states that the device can accept a fluid coding system 
and that it can make several searches simultaneously, 
thus at least pointing toward an increase in effective 
speed. It is interesting to note that, at the same time, 
Garfield pointed out the advantage of pre-arrangement 
of cards so as to eliminate some parts of a search. 

Unit-Card Approach

Whaley (62) distinguished between "scanning" sys- 
tems and "collating" systems. In the former, the 
effort, as described above, is to use one card for each 
document and to punch into the card several subjects. 
Then the entire field is sorted for any one of the sub- 
jects or, in the IBM 101, several separate sorts may 
go on at once. The collating system uses, in contrast, 
the unit card approach, and several cards are made 
for each article or item of information and maintained 
in separate files or in an ordered file such that blocks 
of cards represent definite subjects. 

Peakes (63) has described an internal report in- 
dexing system which began as a scanning system but 
which was converted to a collating system after one 
year because: 

Our attempts to search such a file [a scanning 
system] revealed that all the cards of the index 
must be passed through the search machine for 
every inquiry. For the type of question which we 
wished to answer, as this kind of index grew, it 
was placed at an ever-increasing disadvantage as
to both elapsed time for an inquiry and the rate at 
which inquiries could be answered. Also, we were 
concerned about the possibility of cards wearing 
out and creating a severe card-replacement prob- 
lem. 

In the revised system, each "unit of indexable in- 
formation" is tied to a serial number representing the 
report in which it is found and is coded for specific 
subject and is further identified as pertaining to one of
four broad concepts: raw material, process, product, or
test on the product. A question involving the results
of a particular process on a particular raw material is 
then answered by selecting the pertinent process cards 
and raw material cards and merging the two decks into 
a single sequence of serial number. Then, where a 
process card and raw material card have the same ser- 
ial number, they will appear together, indicating that 
the report carrying that number contains some informa- 
tion required for answering the question. 
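
The merging operation described for the revised system is, in effect, a search for serial numbers common to two decks. A minimal sketch follows, assuming each selected card is reduced to its report serial number; the function name and sample numbers are invented for the example and do not come from Peakes.

```python
def collate(process_deck, raw_material_deck):
    """Return serial numbers present in both decks, in ascending order.

    Each deck is a list of report serial numbers taken from the
    selected process cards and raw material cards respectively.
    """
    a = sorted(process_deck)
    b = sorted(raw_material_deck)
    i = j = 0
    matches = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # The two cards fall together in the merged sequence.
            if not matches or matches[-1] != a[i]:
                matches.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return matches


print(collate([104, 17, 52, 88], [52, 9, 104, 300]))
```

Only the cards already selected for the two concepts pass through the machine, which is the advantage claimed for the collating system over scanning the whole index for every inquiry.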

Whaley (64) has described a collating system of 
document indexing which involves creation of a term
index, the assignment of "roles" to terms, the punching
of the "address" (i.e., specific location within a
report) of each "structerm" into a card and the filing of
cards by term number. Role is an indication of the 
use of a term (as, for example, "Estimate or determ- 
ine the cost of"). Structerm is defined as a term with 
a role assigned. The use of structerms represents an 
effort, reportedly successful, to reduce the number of 
references resulting from the sorting process. A card 
is punched for each structerm in each internal report 
and then the addresses are consolidated on one card 
for each structerm in a particular report. The con- 
solidation of addresses makes possible a reduction in 
the number of cards to be retained and sorted. In
addition, Whaley proposes to limit the size of the file by
removing cards representing old reports and storing
those cards for occasional reference.
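
The consolidation of addresses can be pictured as a grouping step. In the sketch below each punched card is assumed to be a (report, structerm, address) triple, and the consolidated card is a list of addresses under one (report, structerm) key; this representation and the sample values are hypothetical and are not drawn from Whaley's account.

```python
from collections import defaultdict


def consolidate(cards):
    """Combine all addresses of a structerm within one report.

    cards -- iterable of (report, structerm, address) triples
    Returns a dict mapping (report, structerm) to its list of addresses.
    """
    grouped = defaultdict(list)
    for report, structerm, address in cards:
        grouped[(report, structerm)].append(address)
    return dict(grouped)


cards = [
    ("R-12", 401, "p. 3"),
    ("R-12", 401, "p. 9"),
    ("R-12", 77, "p. 3"),
    ("R-40", 401, "p. 1"),
]
consolidated = consolidate(cards)
print(len(cards), "cards reduced to", len(consolidated))
```

The saving depends on how often a structerm recurs within a single report; where each occurs once, consolidation removes no cards.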

Earlier reference was made to several articles on 
machine-sorted cards in literature searching resulting 
from the Western Reserve University Symposium, 1957. 
Following is a brief review of six of them not covered 
so far. 

McCafferty (65) presented the method developed at
the Watertown Arsenal to obtain access to literature
relating to ordnance. Using a classification system
devised by ordnance personnel at Watertown, items of
literature were abstracted and classified, and punched cards
were prepared so that articles or reports could be
identified by serial number, source, publication date and
subject. Subject and numerical indexes to the resulting
file were created. Bibliographic information and an
abstract are contained on each card. It appears that the
primary reason for constructing this file might not have
existed had the Arsenal's library been adequately
organized.

Hayne and Turim (66) described an information re- 
trieval system limited to one chemical compound (chlor- 
promazine) but which encompasses a wide variety of 
subjects. Using a modification of the coding system 
Luhn developed for use with the IBM photoelectric scan- 
ning machine (mentioned above), but adapted for the 
IBM 101, a system was developed in which ". . . there is
no practical limit on the number of factors which may 
be specified in a search, nor on the manner in which 
they may be logically combined" (67). The system is 
not described in detail in the article at hand. 

Livingstone and Welt (68) presented a rather
detailed description of an experiment in coding of data
concerning the relationships of chemical structure to
biological activity. The system depends essentially on
the matching of files of cards on chemical compounds
with files of cards on biological responses. Coded data
abstracted from many sources are punched into the
card in such a manner that many combinations of
factors can serve as the basis for sorting.

The Weil and Hildenbrand (69) article is descriptive
of a project to abstract literature on fuel and lubricant
additives and to create an index to the abstracts on
machine-sorted cards. The following files are maintained:
subject and author files of machine-sorted cards, a file
of abstracts typed on vellum, a file of handwritten
abstracts of both pertinent and non-pertinent materials,
and a file, arranged by patent or accession number,
containing one copy each of the abstracts duplicated from
the vellum slips. In addition, a code
card is filed behind each abstract. The machine-sorted
cards are arranged in the file by a rough compound
classification and a color code is used in the file of
subject machine-sorted cards. A file of trade names
has also been developed. The machine-sorted cards to
be sorted may be selected from the file by compound,
and the search so limited.

MacKinnon, Leary and Levinson (70) reported on an
experiment in applying machine-sorted cards to legal
research processes. Essentially, the
experiment involved classifying the Illinois divorce
statute and using it as a weeder code. The basic theory
is that the Illinois lawyer concerned with a problem on
divorce involving the statutes of another state will want
to know only how the laws of the two states differ. The
divorce statutes of all other states were then similarly
classified and points of difference noted. A machine search
will reveal those points of difference and the searcher
is thus warned to consult the actual statutes.

The adaptation of a classification designed for use
with marginal punched cards to a machine-sorted card
system, and the operation of that system, are the subject
of a report by Weil and Clapp (71). The American
Society for Metals and Special Libraries Association
metallurgical literature classification is divided into
"orders" (first-order, second-order, etc.) by the extent
of detail in each concept considered. In the system
in question, two fields on the machine-sorted cards
are used for recording concepts, and concept designations
are superimposed on one another in those fields.
Work cards, serially numbered, are prepared by the
indexer and are filed by number. Search cards (that
is, machine-sorted cards) are punched with bibliographic
information, the work card serial number and the
concepts indexed. At least two search cards are required
for each work card, and a set of two search cards
is required for every different first-order category in
order to eliminate a first-order search of the entire
file.
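
Superimposing concept designations in one field can be sketched as taking the union of the punch patterns of all concepts indexed. The code table below is invented for illustration and is not the scheme reported by Weil and Clapp. It is chosen to show a general property of superimposed coding: a card can appear to carry a concept that was never indexed (a "false drop").

```python
# Hypothetical code table: concept -> punch positions within one field.
CODES = {
    "steel": {7, 12, 20},
    "corrosion": {3, 7, 20},
    "welding": {5, 12, 20},
    "casting": {2, 9, 30},
}


def punch_field(concepts):
    """Superimpose the patterns of all indexed concepts in a single field."""
    field = set()
    for concept in concepts:
        field |= CODES[concept]
    return field


def matches(field, concept):
    """A card is selected when every punch of the sought concept is present."""
    return CODES[concept] <= field


# A card indexed under two concepts only.
card = punch_field(["corrosion", "welding"])
```

Here `matches(card, "steel")` is true even though "steel" was not indexed, because its three punches happen to be covered by the other two patterns.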

It is perhaps unfair, in view of the fact that this
system had been in existence less than a year at the
time of reporting, to emphasize that it had then been
used "mostly to locate specific economic figures
such as company expansions, metal shipments, or production
figures" (72). This is, in short, precisely the kind
of information which should be obtainable through standard
library tools and techniques.

This discussion of uses of machine-sorted cards
is not complete in that it does not review all applications
of the cards to literature searching and indexing.
It does, however, indicate the major lines of development
from the standpoint of methods. A further and
more detailed review could not be accomplished without 
detailed discussion of coding methods, a subject which 
is outside the scope of this study. 



II. C. Extension of Standard Tools

Machine-sorted cards have been applied to the
manufacture of long-accepted bibliographic devices such
as indexes, catalogs and book lists. In general, these
applications have been aimed at making a central
library's holdings accessible either to its branches or to
other libraries or individuals. The specific use of machine-sorted
cards in this type of application is to produce
copy from which the catalog or list may be duplicated.
Dewey (73) has summarized current applications of this
type and has described some methods now being used.
It is important to bear in mind in reviewing these systems
that the machine-sorted cards and equipment do
not produce catalogs; they produce only copy which
must then be duplicated. Furthermore, one step in the
process which is sometimes overlooked is the completion
of a document of some type from which the cards
are punched by a key-punch operator. That is, while
it would be possible to work directly from the item
being cataloged to the punched card, there normally is
a step in between.

Several applications of machine-sorted cards to
library-related tools were reported immediately following
World War II. The Library of Congress listed textbooks
used in the ASTP and V12 programs and distributed
the lists to colleges in preparation for the anticipated
need by returning veterans (74). Arnhym (75)
reported on the use of machine-sorted cards for organizing,
processing and distributing technical and intelligence
information. No description in detail of these
processes was located in the literature.

Currently a project is under way at the National 
Library of Medicine (under a grant by the Council on 
Library Resources, Inc.) to combine tape-operated 
typewriters, International Business Machines and the
Listomatic Camera in producing the Current List of 
Medical Literature (76). While the project is directed 
toward production of the Current List, it may have side 
benefits rather similar to those activities of a mechan- 
ized documentation center. For example, it may be 
possible, once the envisioned files of cards are in ex- 
istence, to produce annual or cumulated bibliographies 
on broad medical subjects. The Current List of Med- 
ical Literature presents some difficulties to the search- 
er in that complete bibliographic citations are not given 
under subject headings and the searcher must continual- 
ly turn from the listings under subjects to a list of 
articles indexed. In the proposed method, the bibliographic
citation will be typed, by tape-operated typewriter,
on a machine-sorted card, using only the upper
and central portion of the card. The balance of the 
card may then be used to code the various subjects 
under which the article will appear. Sorted into se- 
quence by author or by subject, the card will be passed 
through the camera to produce a film which will serve 
as copy from which the Current List will be produced. 
Up to three typed lines of "copy" may be put on each 
card, thus reducing the number of trailer cards as 
compared with the method used for New Serial Titles,
to be mentioned later. 

A proposal to maintain a union catalog of serials 
at the Library of Congress (77) and to publish it based 
on a machine-sorted card system has been made but has
as yet not been put into effect. However, the publica- 
tion of information on serials titles newly received by 
a relatively large group of libraries has been carried 
through by the Library of Congress (78). In contrast 
to the proposed system being developed by the National 
Library of Medicine, the New Serial Titles method in- 
volves using one card for each line in a given entry. 
Copy for reproduction is then produced by use of a tab- 
ulating machine. The main entry for each serial to be 
listed requires at least one card (and commonly more 
than one). The entry cards are then followed by a card 
for each library for which holdings are indicated. 

County libraries have in many cases operated with 
many branches and stations spread over wide areas.
Typically, those libraries have had to operate on tax 
bases which are small, as compared to city libraries 
and in regard to the number of people served. One 
rather common technique used in those circumstances 
is to provide the branch or station with a rather small 
basic collection and then to move the newer materials 
from one location to another. In many cases, branches 
and stations have not had complete catalogs but have 
used shelf lists or other devices. Requests from the 
branch or station to the central library are then made 
without knowledge of what the main collection contains, 
and sometimes without complete knowledge of what ma- 
terial is in the branch itself. 

In the King County application (79), machine-sorted
cards are used as a locator file and to produce copy
for book catalogs of the material in each branch. The 
branch catalogs each consist of an author list, a title 
list and a subject list, the latter divided into adult and 
juvenile titles. The subjects used in the branch cata- 
logs are not identical with those used in the main 
library's card catalog, and a "key" relating the two 
systems of subject headings is supplied to each branch. 
The punched cards are also used to imprint book cards 
and date due slips with the required information. 

The Los Angeles County Public Library has used 
a different approach to the same basic problem. 
MacQuarrie (80) described that system and stated its 
benefits as: 

1. The library's entire holdings are open to in- 
spection at any service point in the system. 

2. The existence of the book catalog makes it un- 
necessary to file and pull cards for books as they are 
moved from one branch to another. 

3. Requests for material made by branches to the 
central library are accurate. 

4. Gaps in the book collection stand out more 
clearly in book form than on cards.

5. The book catalogs result in more effective use
of branch collections, especially since many branches 
did not have catalogs of even their own collections prior 
to the introduction of this system. 

These systems, of course, require that a file of 
punched cards representing the book collection be cre- 
ated and maintained. They also require some system 
for keeping the book catalogs up to date. In the case 
of the Los Angeles County Public Library system, sup- 
plements are issued monthly and supplements are inter- 
filed in the bound volumes quarterly (81). 

The Columbia River Regional Library, Wenatchee, 
Washington, has used the Los Angeles County system 
(82). 

These systems are not new in the sense of having 
created a new library tool. Distribution of a complete 
catalog of the library's holdings does represent a con- 
siderable departure from existing county and regional 
library methods in the areas mentioned. Of course, 
other methods of manufacturing those catalogs could 
have been developed. 

The New York State Library, using machine-sorted
cards to produce copy, has published a check-list of its
holdings in the social sciences (83). The aim, of course, 
is to make the materials held by the State Library more 
widely known and used. 



III. Some Problems

Machine-sorted cards have been applied to three
broad areas in librarianship: first, to routine, repetitive
operations; second, to literature searching; and
third, to the production of copy from which lists and
bibliographies are produced. It scarcely needs to be 
pointed out that all of these operations can be performed 
by other methods; indeed, the application of machine- 
sorted cards represents the unusual rather than the 
standard operation.

To be of significant value, then, the machine-sorted
card system must either be less expensive than
other methods or it must be capable of producing some
end result not obtainable by other methods. The information
available in the literature does not demonstrate
that machine-sorted cards, applied to library procedures,
make possible the achievement of either goal.

In the first type of application (routine, repetitive 
operations), there appears to have been an assumption 
that library records or files are sufficiently similar to 
business and industrial records that at least some of 
the advertised benefits will accrue to the library which 
uses machine-sorted cards. But are they? A circulation
file may seem similar to an inventory record of
parts. However, the virtue of the circulation card is
that it distinguishes one bibliographic unit from all others,
while much of the virtue of the inventory card is
that it shows the similarity or exact identity of springs
or bolts or whatever it represents. Thus, almost automatically,
one may presume that most cards in a circulation
file will be used infrequently as compared to most
cards in an inventory file.

A common argument in favor of machine-sorted
card systems is that they make available information
which is not obtainable (or not readily so) with conventional
methods. For example, Moffitt (84), in describing the
use of machine-sorted cards in financial control of serials
subscriptions, lists several such benefits. The
first of those mentioned is the provision of an annual
list of serials to the administrative offices of the library.
These lists are ". . . useful in answering many
of the questions concerning serials without the necessity
of referring to the serials unit for information" (85).
Evaluation of such additional benefits in terms of cost
is universally lacking. Furthermore, in this particular
instance, it is not clear why the serials unit
should not be expected to supply information whenever
it is asked to do so, and to supply more up-to-date information
than could be had from an annual list.

In the second type of application (literature search- 
ing), some serious questions are not covered at all in 
the literature, yet surely they must have occurred to
persons who have experimented with machine-sorted
card systems. For example, one virtue of the standard
card catalog and of the printed bibliography or index
is that it has "integrity" at all times. That is,
entries are not removed from them, or if cards are 
removed from a card catalog, temporary entries show 
at least the existence of the full entry. This virtue is 
not to be found in machine-sorted card files. In fact,
one of the advantages of the latter type of file is that 
things can be removed from it. But what happens 
when a search must be made when another one is going 
on or has just been completed? Must one wait until 
items removed have been interfiled again? Or, does 
one conduct several separate searches of the various 
files, adding one or more files depending on how many 
have been created by previous searchers, the results 
as yet unfiled? And, if one waits, how much time does
it take? Again, what effect have the processes of coding,
machine operation and related processes on the
training and, consequently, the salary scales and availability
of personnel?

Neither of the two basic requirements noted above 
(that of cost savings or unique end result) has been documented
as a virtue of machine-sorted cards as applied
to literature searching. The absence of careful cost 
data will be commented upon later, but the impression 
of a group of foreign librarians is of interest here (86): 

The present use of IBM cards for selecting in- 
formation using commercially available machines 
has not achieved anything which cannot be achieved 
by traditional methods, although it may in certain 
cases be cheaper than manual methods, but ad- 
equate costing data were not obtained. 

Thus, the basic questions have not been answered, and 
other serious questions have been raised but also left 
unanswered. 

In the third area (production of copy for catalogs 
and lists), the case is also unclear. Many devices --
standard type-setting machines, photo-offset machines
-- and various methods (87) exist for producing copy to
be duplicated. The literature does not reveal a single 
attempt to detail the present cost of producing copy for 
a catalog or list as compared to the cost where other 
devices and methods are used. Neither does it show 
careful analysis of the end product, including factors 
such as storage capacity per page of copy. 



IV. Cost Data 

A few of the articles reviewed for this project 
contain some information about the costs of the system 
described. In at least one case (88) there is evidence 
to indicate that more or less complete cost information 
might be available for inspection. In another case (89), 
a cost study is in progress. Fragmentary data is re- 
corded in a few other cases, and it may be that in 
some of those instances cost data might be obtainable. 

To be reliable, cost data must be sufficiently de- 
tailed and collected with sufficient care that one could 
reconstruct a procedure and rely upon being able to 
predict its performance and cost with fair accuracy. So
defined, cost data does not exist for applications of
machine-sorted punched cards to libraries in any instance.
Or, if it does, it is not obtainable in the literature.



V. Suggestions for Research 

Future research in the application of machine- 
sorted cards to libraries will be most fruitful if it can 
yield some detailed information on the two key questions 
formulated earlier: 

1. Can things be accomplished through using ma- 
chine-sorted cards which cannot be accomplished 
through other methods? 

2. What are the costs of machine-sorted card systems
as compared to other methods?

Following the general outline of this paper, other
important questions, not answered in the literature, are: 

I. Routine, repetitive operations

A. What operations in libraries are best suited
to the application of machine-sorted cards?

B. Can some principles or general rules be developed
by which to estimate success in advance
of actual installation of machine-sorted
card systems?

C. Can data be developed which will show with
some precision the effect of machine-sorted
cards and related equipment on the total
process? That is, can the benefits of machine-sorted
cards be isolated from the
other parts of total systems so that it may
be possible to combine various devices or
methods into a "best" system?

II. Literature searching

A. What is the effect of searching a file upon
succeeding searches in terms of adequacy
of coverage, time required and the opportunity
for error?

B. What demands do machine-sorted card systems
make, from the encoding process to
the final product, upon the personnel who
operate the system in terms of training,
supervision and availability? What do these
demands imply for the library in personnel
recruiting and instruction?

III. Preparation of copy for lists and catalogs

Because many tools and methods for preparing
copy for reproduction exist, the most
important question to be resolved here will
be that of cost as compared to other methods.
Information on speed of production of
copy, ease of preparing cumulations and
storage capacity of the resulting page will
also be important, but, again, on a comparative
basis.
Volume Four Part Four
ELECTRONIC SEARCHING 

by 
Gerald Jahoda 



I. Introduction 
A. Scope of the Survey 

The literature on the use of electronic machines 
for information handling and closely related operations 
is reviewed in this report. A differentiation has been 
made by Mooers between two types of systems: an 
information retrieval system, i. e. a system which 
comes up with serial numbers or bibliographic citations 
of documents in which the answer to a question might 
be found, and a question answering system, also called
a data file, which yields the answer directly, e.g. a
chemical or a property of a substance. This differentiation
we consider useful and it forms the basis of arranging
these two types of information systems in this
report. Every other type of application has been
grouped under the commonly used miscellaneous heading. 
In this group machine applications which might assist 
the librarian in the intellectual task of preparing his 
subject authority list are included, as are applications 
aimed at eventually relieving the librarian from the in- 
tellectual tasks of abstracting and indexing. 

The literature on language translation by machine 
has been omitted not because it is not pertinent to the 
overall problem of information retrieval (it is) but be- 
cause of the vast literature in this field and this review- 
er's lack of special knowledge of it. Business applica- 
tions of computers for inventory and various accounting 
operations and scientific and engineering calculations by 
computers have been omitted as being outside the scope 
of this review. 



B. Machine Operations and Characteristics 
Modern data processing machines combine three 
major techniques which are described by Alexander:

1. Transmitting encoded information as elec- 
trical signals, in the sense that one transmits a 
telegram consisting of both numerical and alpha- 
betic texts. 

2. Means for storing such encoded information 
in a form that permits the information to be re- 
covered selectively as electrical signals when 
needed. 

3. Means for processing encoded information in 
accordance with the rules of simple arithmetic and 
the rules for elementary logic (1). 

If optical signals are added to electrical signals, then 
the description also applies to machines which process 
data on film, namely the Rapid Selector, the Filmorex, 
and the Minicard systems. Before any use can be
made of these internal characteristics of the machines,
the information which is to be processed (index entries,
i.e. index headings and some kind of document identification)
has to be translated into a form which is easily
handled by the machine. This consists of translating
the index headings and the document citation into shorthand
symbols, generally numbers, called the code. The
code is then converted into a machine language, i. e. a 
language which the machine can understand. This is 
done by converting the numbers into a series of trans- 
parent and opaque patterns for machines which sense 
optical signals, or by converting the code into holes on 
punched cards or punched tape. In most computer based 
systems the code on punched cards and punched tapes is 
converted into magnetized spots on magnetic tape or 
magnetic discs. 
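
The two translation steps described above (index heading to numeric code, numeric code to holes) can be sketched as follows. This is a minimal illustration only: the three-digit numbering scheme and the function names are assumptions, and the hole layout simply follows the usual numeric punched card convention of one punch per column in the row equal to the digit.

```python
def build_code(headings):
    """Assign each index heading a fixed-width numeric code (hypothetical scheme)."""
    return {h: f"{n:03d}" for n, h in enumerate(sorted(headings), start=1)}


def to_punches(code, first_column=1):
    """Convert a numeric code into (column, row) punch positions.

    Each digit occupies one column and is punched in the row equal to the digit.
    """
    return [(first_column + i, int(digit)) for i, digit in enumerate(code)]


code_table = build_code(["thermal stability", "jet fuels"])
holes = to_punches(code_table["jet fuels"], first_column=10)
```

Conversion of the same code into transparent and opaque patterns on film, or into magnetized spots on tape, would replace only the second function.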

The machine is now ready to manipulate this data, 
that is, to perform simple arithmetical and logical op- 
erations with the data, if it is instructed exactly what 
to do. The sets of instructions are called the program. 
Programming operations vary from machine to machine. 
In the case of machines which sense optical signals this 
may be done by inserting a mask which complements the 
pattern being sought in the reading station. In the case 
of punched card sorters electrical contacts have to be 
made inside the machine by setting dials or switches or
by making these connections with a control panel board. 
A control panel board, also known as a plug board, con- 
tains a number of holes into which wires are inserted. 
These wires make connections with wires inside the 
machine when the control panel is put into place. In 
the case of computers, programming is done by sending 
instructions into the machine on punched cards or 
punched paper tape or magnetic tape. 

The program instructs the machine what to do 
with the encoded data, i. e. the index entries. The 
basic instruction is to match a search heading against 
an index heading. Most machines can match more than 
one search heading against more than one index heading 
in a document. The type of operation which a machine 
can perform is often represented by symbolic logic. An 
index heading or a part of an index heading is repre- 
sented by a letter, for the sake of convenience. Thus 
any 5 index or search headings or parts of index or 
search headings can be represented by A, B, C, D, E. 

A search for a document containing either A or B 
or C or D or E is called a search for a logical sum. 

A search for a document containing A and B and 
C and D and E is called a search for a logical product. 

A search for a document containing A but not B 
is called a search for a logical difference. 
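
The three kinds of search can be written out with ordinary set operations. The sketch below assumes a small hypothetical file in which each document serial number is associated with a set of index headings; the data and function names are illustrative only.

```python
# Hypothetical file: document serial number -> set of index headings.
FILE = {
    101: {"A", "B", "C", "D", "E"},
    102: {"A", "C"},
    103: {"B"},
    104: {"F"},
}


def logical_sum(terms):
    """Documents containing A or B or ... (at least one of the terms)."""
    return sorted(n for n, h in FILE.items() if h & set(terms))


def logical_product(terms):
    """Documents containing A and B and ... (every one of the terms)."""
    return sorted(n for n, h in FILE.items() if set(terms) <= h)


def logical_difference(wanted, excluded):
    """Documents containing `wanted` but not `excluded`."""
    return sorted(n for n, h in FILE.items() if wanted in h and excluded not in h)
```

For this file, the logical sum of A through E selects documents 101, 102 and 103, the logical product selects only 101, and "A but not B" selects only 102.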

The programmed machine scans the encoded rec- 
ord and recognizes items which meet search specifica- 
tions. Once the search has been conducted the machine 
has to produce the results in a form understandable to 
human beings. In some machines this is done by sep- 
arating the units of information; e. g. punched cards 
which identify pertinent documents are physically sep- 
arated from other punched cards. In other machines 
this is done by printing out the documents by identifying 
serial number. In still other machines the search re- 
sults are copies of the abstracts of pertinent documents. 

Machines differ in speed, versatility, and cost. 
The difference in versatility in terms of potential capa-
bilities for information retrieval has been analyzed by 
Perry, Kent, and Berry. They state that subject con- 
tent of documents may be significantly distinguished on 
the basis of the following two different types of char- 
acteristics: 

Type I Characteristics: Spatio-temporal entities 
(substances, devices, organisms, persons, etc.), 
attributes, abstract concepts, processes, locations, 
and conditions involved; 

Type II Characteristics: Relationships involved 
between the entities, attributes, concepts, pro- 
cesses, locations, and conditions. 

In ordinary writing, the first type of characteristics
is usually designated by nouns, adjectives,
verbs, and adverbs. Relationships -- our Type II
characteristics -- on the other hand, are denoted
by the phrasing of sentences, by such grammatical
devices as endings and other affixes or connectives
such as prepositions (2).

Different types of machines are analyzed by these au- 
thors in terms of allowing searches of various complexity 
with Type I characteristics (descriptors) and Type II 
characteristics (relationships among descriptors). 

Type I devices are the simplest possible devices 
which would record only one characteristic for each of 
the documents and would direct searching operations to 
the characteristic or characteristics that correspond to 
the item or items desired. An example of this type of 
device is the National Bureau of Standards Microimage 
Selector, in which a document is merely identified by 
its location. 

A Type II device is illustrated by the Rapid Selector,
as demonstrated several years ago at the United
States Department of Agriculture (3). Each unit of information
on a frame of microfilm was identified by 6
index entries (more than one frame could be used per
item of information). Only one entry could be searched
at any one time. This application of the Rapid Selector
could be construed as a speeded-up search in a conventional
index.

Type III devices can be used to record a multiplicity
of criteria (descriptors) in fixed zones or fields
on the recording medium. The search can be made
for one or more criteria in fixed zones. Standard IBM
or Remington Rand sorters are examples of Type III devices.
In the case of single column sorters, searches
for combinations of, for example, 4 descriptors have to
be made by sending the cards through the machine 4
times. In the case of multiple column sorters a similar
search might be completed in one sort through the
machine. An important limitation of these machines is
their inability to search for Type II characteristics,
namely relationships among descriptors.
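
The difference between the two kinds of sorter can be sketched as a difference in the number of passes needed. The card layout, column numbers and function names below are hypothetical; the sketch assumes each descriptor occupies its own fixed column.

```python
def sorter_pass(cards, column, punch):
    """One trip through a single column sorter: keep cards punched `punch` in `column`."""
    return [card for card in cards if card["columns"].get(column) == punch]


def single_column_search(cards, criteria):
    """One pass per criterion; returns (selected cards, number of passes)."""
    passes = 0
    for column, punch in criteria.items():
        cards = sorter_pass(cards, column, punch)
        passes += 1
    return cards, passes


def multiple_column_search(cards, criteria):
    """All criteria are tested in a single pass."""
    selected = [
        card for card in cards
        if all(card["columns"].get(c) == p for c, p in criteria.items())
    ]
    return selected, 1


# Hypothetical cards: each descriptor is recorded in its own fixed column.
CARDS = [
    {"serial": 1, "columns": {10: 3, 11: 7, 12: 1, 13: 9}},
    {"serial": 2, "columns": {10: 3, 11: 7, 12: 1, 13: 4}},
    {"serial": 3, "columns": {10: 3, 11: 2, 12: 1, 13: 9}},
]
CRITERIA = {10: 3, 11: 7, 12: 1, 13: 9}  # a combination of 4 descriptors
```

Both searches select the same card; the single column version reports 4 passes and the multiple column version reports 1.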

Type IV devices are row-by-row punched card
scanners as exemplified by the Luhn scanner. In addition
to being able to do everything that Type III devices
can do, Type IV devices can treat consecutive
IBM cards as one continuous unit. Type II characteristics
can also be indexed and searched. The combination
of these factors makes the Type IV device a much
more flexible device for searching.

Perry, Kent, and Berry's Type V device can handle
Type II characteristics in a more sophisticated manner
than that ascribed to Type IV devices. For example,
the machine can differentiate between "Man bites dog"
and "Dog bites man," "Blind Venetian" and "Venetian
blind." The machine has also the capability of detecting
the beginning and end of sequences of descriptors.
Since Perry, Kent and Berry's text on machine literature
searching was written, a Type IV device -- the ILAS
(Interrelated Logic Accumulating Scanner) -- has been
built which is capable of differentiating such relationships.
Consequently the difference between Type IV and
Type V criteria is no longer as sharp.
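
The distinction drawn here can be illustrated by comparing descriptors first as an unordered collection and then as a sequence. This is only a sketch of the idea, not a description of how any of the machines named actually encodes relationships; the function names are invented.

```python
def same_descriptors(a, b):
    """True when two entries carry the same descriptors, order ignored."""
    return sorted(a) == sorted(b)


def same_relationship(a, b):
    """True only when the descriptors also occur in the same sequence."""
    return list(a) == list(b)


first = ["man", "bites", "dog"]
second = ["dog", "bites", "man"]
```

A device limited to combinations of descriptors treats `first` and `second` as identical; a device which takes account of sequence can tell them apart.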

Perry, Kent and Berry's Type VI device is a refinement
of their Type V device. The machine can "interpret"
the meaning of a descriptor by means of a look-up
table in its memory before starting on its matching
operation of descriptors and descriptor connections (4).
A general purpose computer based system such as the
IBM 704 Electronic Data Processing Machine can probably
be programmed to include Type VI device criteria.

Machines which can recognize electrical signals 
can be grouped into 3 categories, which are arranged 
in increasing order of complexity: 

1. Machines which can only handle information 
entered on one punched card as the largest unit. These 
machines can recognize combinations of descriptors but 
cannot recognize stated relationships among descriptors. 
Conventional column-by-column scanning punched card 
sorters are examples of this type of machine. 

2. Machines which can handle information entered 
on one or more punched cards as a single unit. These 
machines can recognize combinations of descriptors and 
stated relationships among descriptors. This is not because
of any additional capacity for performing logical
or mathematical operations in the machine but because
a code which indicates relationships among descriptors
can only be conveniently prepared if more than the 80
columns of a punched card can be used as a continuous
unit for any one document. Row-by-row scanning
punched card sorters such as the ILAS are an example
of this type of machine.

3. Machines which can handle information entered 
on one or more punched cards (or the equivalent on 
magnetic tape) as one unit and can store preliminary 
results of the searches in their memory. These ma- 
chines can recognize combinations of descriptors, stated 
relationships among descriptors, and can introduce var- 
ious refinements in the search. An example of this is 
the assignment of a numerical value to each descriptor 
in a search based on its significance for that particular 
search and the selection of documents which contain 
descriptors with a sum total of a stated numerical value. 
Computers, such as the IBM 704 Data Processing Ma- 
chine, are an example of this type of machine. 
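
The weighted search mentioned under the third category can be sketched as follows. The documents, descriptors and weights are hypothetical, and the reading of "a stated numerical value" as a minimum total is an assumption of this sketch.

```python
# Hypothetical file: document serial number -> descriptors assigned to it.
DOCUMENTS = {
    1: {"jet fuels", "thermal stability", "additives"},
    2: {"jet fuels", "storage"},
    3: {"lubricants", "additives"},
}

# Weights chosen by the searcher for this particular search (illustrative values).
WEIGHTS = {"jet fuels": 3, "thermal stability": 2, "additives": 1}


def weighted_search(documents, weights, threshold):
    """Select documents whose matching descriptors total at least `threshold`."""
    hits = []
    for serial, descriptors in documents.items():
        score = sum(w for d, w in weights.items() if d in descriptors)
        if score >= threshold:
            hits.append((serial, score))
    return sorted(hits)
```

The running total kept for each document corresponds to the preliminary results which, according to the text, a machine of this category can hold in its memory.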

We may also list briefly the individual components
or units of computers and Type I and II punched card
sorters. A digital computer generally consists of these
five parts:

An input unit where information is received from 
the outside world in machine language; 

A storage or memory unit where information is 
stored or remembered and ready for use in a 
large number of locations or registers;

An arithmetic unit where information is operated 
on arithmetically or logically; 

A control unit, the unit which controls the switch- 
es or gates that connect specific registers in the 
units and thus controls the sequence of operations 
in the computer; 

An output unit where information is returned to 
the outside world, usually in a form readable by 
human beings. 

These five units are connected so that instructions and 
information (both in the form of electrical impulses) 
can flow from one unit to another (5). 

Punched card sorters, on the other hand -- as exemplified
by the IBM 101 Electronic Statistical Machine
-- are much simpler devices. This particular multiple
column sorter consists of several components all incorporated
into one unit. These components are:

The input station, which is the card feeding de- 
vice, also known as the hopper; 

The card reading station, which is equivalent to 
the computer's arithmetic unit in that it is the 
part of the machine where information on individ- 
ual cards is examined; 

The instruction station, which in this case is the
control panel where wires are connected to the
card reading station to perform the desired operation;

The output station, where information is returned 
to the outside world either as punched cards sort- 
ed in a particular pocket or as a line of print 
generated by the machine's print-out unit. 

The row-by-row scanning punched card sorter, as
exemplified by the ILAS, has a separate instruction
station which is connected to the input, card reading,
and output components. This particular machine does
not have a print-out component.

Detailed descriptions of these machines can be
found in manufacturers' instruction manuals such as
the IBM 101 manual (6) or in general texts on computers
such as the books by Berkeley and Wainwright (7)
and by Chapin (8).



II. Information Retrieval Systems
A. Types of Indexing Systems 

The shopper for an indexing system can make his
selection from several packages, the contents of which
can be custom blended. He can choose a well-worn and
long-tested traditional package: the alphabetic subject
index, the alpha-classified index, or the classified index.
He can also choose a newer package, some form of coordinate
index. Or he can make a reservation for the package
that is still a gleam in the manufacturer's eye, the index
prepared by machine.

Coordinate index is a generic term for an indexing 
system which departs from the one-to-one relationship of 
the index heading to the citation as a unit as it exists in 
conventional indexing systems. On one physical unit such 
as an index card, a punched card, or a section of paper or 
magnetic tape, more than one index heading is generally 
associated with one document citation, or else more than 
one document citation is associated with an index heading. 
This is illustrated by the following unit records of a conventional
indexing system and two coordinate indexing systems:

Conventional alphabetic subject index:

     Jet fuels, thermal stability
     # 1235

Coordinate index, conventional grouping:
index heading on document card

     [Illustration: punched card for document 1235]

Coordinate index, inverted grouping:
document numbers on index heading card

     [Illustration: card headed "Jet" with numbered columns,
     on which the document numbers 120, 311, 12, 1235, 689,
     411, 52, and 62 are posted]

In the illustrated examples the serial numbers identify 
the document. The serial number is the equivalent of 
an abbreviated citation. The holes on the punched card 
stand for index headings which are assigned to that par- 
ticular document. 
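
The two coordinate groupings hold the same information and differ only in which element heads the unit record. A minimal sketch in Python, assuming hypothetical headings and borrowing a few serial numbers from the illustration purely as sample data, derives the inverted grouping from the conventional one:

```python
from collections import defaultdict

# Conventional grouping: one record per document, listing its index headings.
# The headings assigned to each serial number here are invented for illustration.
document_cards = {
    1235: {"jet", "fuels", "thermal", "stability"},
    689: {"jet", "engines"},
    52: {"jet", "fuels"},
}


def invert(cards):
    """Build the inverted grouping: one record per heading, listing documents."""
    heading_cards = defaultdict(set)
    for serial, headings in cards.items():
        for heading in headings:
            heading_cards[heading].add(serial)
    return dict(heading_cards)


heading_cards = invert(document_cards)
print(sorted(heading_cards["jet"]))    # serial numbers posted on the "jet" card
print(sorted(heading_cards["fuels"]))  # serial numbers posted on the "fuels" card
```

Either grouping can be rebuilt from the other, so the choice between them is a matter of which kind of lookup the installation needs to make quickly.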

The physical association of more than one index 
heading to a document citation or more than one docu- 
ment citation to an index heading permits manipulations 
of the coordinate index which are not possible with a 
conventional index. A search can be made for a docu- 
ment which has two or more index headings in common, 
which has one index heading but not another, or which 
has one of several specified index headings, by a mechanical
matching of numbers or holes. (The critical
reader will no doubt say that this is what we are doing
all the time when we search a card index under one
heading and make our decision of accepting or rejecting
an item on the basis of the existence of another heading
mentioned in the tracings. This is certainly true; but
with the coordinate index one can carry this to greater
heights of sophistication and one can do it by machine.)



Greater advantage is taken of this ability to coordinate
headings by splitting index headings into
smaller but more generic units, usually called descriptors.
For example, the index heading "jet fuels, thermal
stability" might be broken down into 4 descriptors:
thermal, stability, jet, and fuels. Each of these
descriptors, either singly or in any combination, can then
be used as an access point to the index.

By splitting up the conventional index heading into 
descriptors, the relationships among the individual parts 
of the index heading are lost. In some of the coordinate 
indexing systems this loss is not considered serious. 
The argument is that information will not be lost but 
that additional though extraneous documents will be se- 
lected. It is easier to separate these extraneous docu- 
ments manually than to prevent their appearance in the 
first place. In other coordinate indexing systems the 
relationship among descriptors in a document (other than 
the sometimes accidental and misleading relationship of 
being in the same document) is brought out in the index 
in order to reduce the yield of extraneous documents in 
a search. 

The third type of index is in its early development
stage. It differs from the conventional and coordinate
index not so much by its form of entry as by its method
of preparation. After the preliminary work of the
librarian is completed, the machine will prepare the index
to the documents and, given a request in essay
form, the machine will translate it into its index language
and perform the search.



B. Intellectual and Mechanical Aspects of
Information Retrieval

Indexing systems, whether conventional or coord- 
inate, are shaped by two major factors: the stipulated 
users' requirements and the financial resources for de- 
veloping and operating the indexing system. These fac- 
tors are based on the following considerations: 

Type of user: research scientist, patent attorney,
administrator, engineer, student or general public;
specialist's use of system in his field or in someone
else's field;

Size of installation in terms of number of docu- 
ments: present size, rate of acquisition, expect- 
ed maximum size; 

Subject matter of collection: its homogeneity or
heterogeneity;

Other available indices to collection; 

Type of anticipated searches: specific, generic, 
correlative, predominance of any one type of 
search; 

Frequency of searches: per given period, at one
time;

Search results: required completeness, required 
up-to-dateness, required speed of completion, re- 
quired freedom from non-pertinent references; 

Type of use of index: on a self-service basis, 
through librarian, in multiple locations, in central 
location; 

Users' and management's attitude toward existing
index;

Time and resources available for development 
work on system; 

Time and resources available for incorporating 
backlog material into system to make it useful at 
an earlier date; 

Time and resources available for routine opera- 
tions of system; 

Availability of data processing machines.

The sum total of these variables constitutes the environment
of the system. Just as different variables make
up the environment of the system, so the indexing system
itself is made up of a number of variables. Calling
an index an alphabetic subject index or a coordinate
index identifies its genre in a broad way but gives
little information about the index itself. Some of the
variables in an indexing system will, therefore, now be
identified. For the sake of convenience these variables
will be grouped into intellectual and mechanical aspects
of the index.

Intellectual Aspects 

The indexing and searching operations in any given
installation and for any system, except one in which
indexable information is selected by machine, involve the
following steps: A document selected for inclusion
into the system is read for indexable information as 
specified by a set of rules. The depth of indexing will 
influence the time spent in reading (or scanning) the 
document. A time limit for reading any one article 
might be set or instructions might be given to read only 
certain parts of the document, such as the summary, 
the table of contents, the conclusion, or the abstract. 

The reading so far has been for indexable infor- 
mation. The translation of this information into index 
language, again according to a set of instructions, is 
the next step in the process. New terms are incorpor- 
ated into the system at this stage, either as indexing 
terms or as cross-references. The last step is a mechanical
one and consists in the actual preparation of the
index entries according to the various systems and in
the finding of a "parking place" for the indexed document.

For the retrieval operation some of the above 
steps are reversed. The inquirer poses his question 
in terms of the index language. This again requires 
the translation of communicative language into the index 
language and is often done with the assistance of the indexer
or someone else familiar with its operation in
order to bring the thoughts expressed in two relatively
different languages into coincidence. The index is then
searched under these entries, either manually or mechanically,
by the final user of the information, by
some intermediary, or by machine. The question may
be phrased differently a second time if the wrong index 
headings have been selected or if the inquirer adjusts 
his search as a result of information obtained through 
the first search of the index. This process may be 
repeated several times. The number of times is de- 
pendent on several factors, such as necessity for com- 
pleteness of information, amount of time available, per- 
severance, and ability to define what is wanted. 

A number of variables or decision points are in- 
volved before the indexing system can be put into opera- 
tion. 

Selection of documents to be included into system:
The selection might be by form of literature,
e.g. all internal reports, or by subject, e.g. all
published and unpublished information on a particular
group of chemicals.

Depth of index: 

This involves the amount of information in doc- 
uments to be included in the index. To use ex- 
tremes, index entries might be made from the 
title only or from every bit of information con- 
tained in the entire document. 

Point of view of index: 

This will depend on the present and anticipated 
interests of the users. Decisions have to be made 
whether to index from the author's point of view 
only, from the present user's point of view, from
all possible points of view, or from any point on 
this spectrum. 

Specificity of the individual index entry: 

This is the amount of detail included in the in- 
dividual index heading. Decisions will be based 
on the size of the installation, the amount of in- 
formation now available and anticipated on a given 
subject, the user's interest in a given subject, and
the indexer's philosophy on retrieving non-pertinent
information.

Structure of the index heading: 

A choice has to be made between words as 
headings, phrases as headings, and in some co- 
ordinate indexing systems arbitrarily defined units 
which are combined to make up the index heading. 
An example of the latter is the characterization 
of a chemical not by its name but by two or more 
identifying characteristics such as the number of 
carbon atoms, the type of bonds connecting the
elements which make up the chemical, or the 
functional groups which are present in the chem- 
ical. The index may also be made up of several 
levels of generality as in a classification system. 

The arrangement of the index headings: 

The arrangement of the list of index headings 
can be alphabetical, classified, or alpha-classified.
In case of a classified arrangement an alphabetic 
index to the system is necessary to permit access 
to the information. A variation on the traditional 
hierarchical classification is the grouping of the 
terms in a small number of relatively broad cat- 
egories. The total number of terms is often small 
so that the terms in each pertinent category can be 
read whenever an item is indexed or whenever the 
index is searched. 

The arrangement of the constituent parts of the in- 
dex heading: 

This is to a certain extent a factor determined 
by the type of indexing system. The arrangement 
will range from a completely ordered set of terms 
in the case of a facetted classification system to 
no particular order whatsoever in some alphabetic 
indexing systems. Coordinate indexing systems
consisting of single words only sidestep this par- 
ticular problem. 

Relationships among indexing terms: 

The relationships among terms in a conventional
indexing system, i.e. an alphabetic subject, alpha-classified,
or classification system, are
brought out by word order (blind Venetian vs.
Venetian blind), prepositions (reaction of benzene
vs. reaction in benzene), and punctuation symbols
(Chemistry, analytic, vs. Chemistry-Bibliography).
In coordinate indexing systems these relationships 
are either not brought out (descriptors are mere- 
ly listed but not related to each other) or are 
brought out by means of the following devices: 

1. The modification of the descriptor to re- 
duce the scope of its meanings: Benzene (re- 
actant) or benzene (solvent); 

2. The assignment of an additional code to
denote descriptor order or relationships among
descriptors: Venetian--1, Blind--2.

Control of vocabulary of indexing terms: 

The two extreme cases are a subject authority 
list for all indexing terms used in the system and 
the selection of terms from the documents without 
any control of the indexing vocabulary. The sub- 
ject authority list, a defined list of indexing terms 
along with a network of cross-references, is un- 
questionably the preferred approach. Intellectual 
and cost problems, however, often preclude its 
preparation. 

Number of indices to the collection: 

In addition to any published index available for 
the collection, more than one index is sometimes 
prepared. This is particularly desirable when 
part of the information has to be indexed in great- 
er detail and/or doesn't fit into the overall pattern 
of the index. 

Completeness of search results and amount of ex- 
traneous material retrieved along with the pertin- 
ent search results: 

These two factors are interrelated since most 
systems cannot be designed to yield all the pertin- 
ent material without any non-pertinent material. 
The desired amount of pertinent material and the
tolerated amount of non-pertinent material will
have a bearing on the design of the system. Two
types of non-pertinent results exist, only one of
which occurs in conventional alphabetical, alpha-classified,
and classified systems. The type of
non-pertinent result which occurs with both conventional
and coordinate indexing systems is often referred
to as the "noise" of the system. It occurs
when the selected document falls within the defined
scope of the index heading but is of no interest
for that particular search. An example of
this would be a search for poodles in an index in
which the most specific heading was "dogs." A
document on cocker spaniels is legitimate as far
as the system is concerned but it is of no use for
the search in question.

The second type of non-pertinent search result 
is called a false drop and it occurs only in coord- 
inate systems. It is due to the interaction of un- 
related indexing terms in the same document or 
the interaction of unrelated parts of the code. This 
is illustrated by the following example: Two sub- 
jects which occur in the same document are: 

Property X of Chemical A 
Property Y of Chemical B 

This is indexed as Property X, Chemical A, Prop- 
erty Y, and Chemical B--four indexing units, 
called descriptors, which are tied together by 
means of a common document serial number (the 
document's identification). The relationship among 
the 4 descriptors is not specified. Consequently, 
a search for Chemical A which has Property Y 
will yield this particular document even though it 
does not contain the desired information. 
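
The false drop in this example can be reproduced in a few lines. The sketch below is illustrative only: the serial number 77 and the link letters "a" and "b" are hypothetical, and the link letter stands in for the kind of additional code mentioned earlier for denoting relationships among descriptors.

```python
# One document (hypothetical serial number 77) reports two unrelated facts:
#   Property X of Chemical A, and Property Y of Chemical B.
# Descriptors that belong together carry the same link letter.
linked_index = {
    77: {
        ("property x", "a"), ("chemical a", "a"),
        ("property y", "b"), ("chemical b", "b"),
    },
}


def search_unlinked(index, wanted):
    """Select a document whenever all wanted descriptors occur in it."""
    hits = []
    for serial, entries in index.items():
        descriptors = {descriptor for descriptor, _ in entries}
        if wanted <= descriptors:
            hits.append(serial)
    return hits


def search_linked(index, wanted):
    """Select a document only if the wanted descriptors share one link."""
    hits = []
    for serial, entries in index.items():
        for link in {link for _, link in entries}:
            group = {d for d, entry_link in entries if entry_link == link}
            if wanted <= group:
                hits.append(serial)
                break
    return hits


query = {"chemical a", "property y"}
print(search_unlinked(linked_index, query))  # [77], a false drop
print(search_linked(linked_index, query))    # [], the false drop is suppressed
```

The linked search costs an extra code on every entry, which is the trade-off between indexing effort and extraneous search results described above.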

Mechanical Aspects 

The manner in which information is stored, the 
way it is searched, and the form of search results ob- 
tained depend to a large degree on mechanical aspects 
of the system.

Order of descriptor (index heading) to document
citation (document serial number):

Three choices are possible: the listing of all
descriptors for a document under that document 
number, the listing of all the document numbers 
which are characterized by a descriptor under 
that descriptor, and the listing of one descriptor 
to one document number. 

Type of code: 

In both conventional and coordinate indexing
systems, the index heading, or the descriptor,
may be written as words (as in the case of the
alphabetical subject index or the manual Uniterm
system) or as shorthand symbols, called either
the notation or the code (as in the case of classification
systems or hand and machine sorted
punched card systems). The code in a hand or
machine sorted punched card or computer based
system is translated into a form which is best
manipulated or "read" by the machine and which
makes most efficient use of the available space.
Several types of codes are exemplified:

Direct code: Each position is assigned a 
meaning completely independent of meanings 
assigned to other positions. For example, the 
position characterized by column 22 row 8 on 
an IBM card might stand for the descriptor A; 

Indirect code: The significance of a given 
position is dependent upon its combination with 
another position. An example of this would be 
the assignment of column 24 row 9 and column 
30 row 1 of an IBM card to mean descriptor B. 

Much of the coding in machine based systems is of
the indirect variety. The example above is called
a numerical code. Other varieties make use of
combinations of numbers available from 4 positions.
For example, any number from 0 to 9 can be obtained
from 4 positions assigned the meaning 7, 4,
2, 1 and using a maximum of 2 positions for each
number. (0 = no punch, 1 = 1, 2 = 2, 3 = 2 and 1,
... 9 = 7 and 2.)
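
The full table for the 7-4-2-1 code follows mechanically from the rule just stated. As a minimal sketch (the function and variable names are illustrative, not taken from any machine manual), the ten digits can be generated by trying every choice of at most two of the four positions:

```python
from itertools import combinations

WEIGHTS = (7, 4, 2, 1)  # meanings assigned to the four positions


def build_code_table():
    """Map each digit 0-9 to the positions punched, using at most 2 punches."""
    table = {0: ()}  # zero is represented by no punch at all
    for punch_count in (1, 2):
        for combo in combinations(WEIGHTS, punch_count):
            total = sum(combo)
            if total <= 9 and total not in table:
                table[total] = combo
    return table


def decode(punched_positions):
    """Recover the digit by adding the meanings of the punched positions."""
    return sum(punched_positions)


table = build_code_table()
for digit in range(10):
    print(digit, table[digit])
```

The only pair that is skipped is 7 and 4, whose sum of 11 lies outside the range of a single digit; every other choice yields a distinct digit.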

A random superimposed code is another type 
of indirect code. In this code 2 or more numbers 
are entered in a code field (a number of columns 
on an IBM card, for instance) in such a way that 
the numbers partially overlap or are superimposed 
on each other. This is illustrated in the following 
10 column code field into which 3 codes of 4 
positions each are entered: 



  1   11   21   XX   41   51   61   71   81   91
 XX   12   22   32   XX   52   62   72   XX   XX
  3   13   23   33   43   XX   XX   73   83   93
  4   XX   24   34   44   54   64   74   84   94
  5   15   XX   35   45   55   65   75   85   95
  6   16   26   36   46   56   66   76   86   96
  7   XX   XX   37   47   57   67   77   87   97
  8   18   28   XX   48   58   68   78   88   98
  9   19   29   39   49   59   69   79   89   99
 10   20   30   40   50   60   70   80   90  100

(XX marks a punched position.)

Codes: 2, 14, 25, 92
       17, 38, 42, 63
       27, 31, 53, 82

The interaction of individual positions of these 
codes produces erroneous combinations such as 
2, 14, 17, 25, etc. By a judicious assignment of
codes, however, this type of erroneous or false 
combination can be kept to a minimum in the 
searching operation. 
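
The origin of these false combinations can be seen by treating the code field as a set of punched positions. The following sketch uses the three codes of the example; the descriptor names are placeholders, and the count at the end assumes, for illustration only, that any four positions could have been assigned as a code.

```python
from itertools import combinations

# The three 4-position codes from the example, keyed by placeholder names.
codes = {
    "descriptor 1": {2, 14, 25, 92},
    "descriptor 2": {17, 38, 42, 63},
    "descriptor 3": {27, 31, 53, 82},
}

# Superimposing the codes means punching the union of their positions.
code_field = set().union(*codes.values())


def responds(field, code):
    """A card responds to a code when every position of the code is punched."""
    return set(code) <= field


false_combination = {2, 14, 17, 25}
print(responds(code_field, codes["descriptor 1"]))  # True, a genuine entry
print(responds(code_field, false_combination))      # True, although never entered

# Upper bound on spurious patterns if any 4 positions could form a code.
patterns = list(combinations(sorted(code_field), 4))
spurious = [p for p in patterns if set(p) not in codes.values()]
print(len(patterns), len(spurious))
```

Only a small share of those patterns will correspond to codes actually assigned to other descriptors, which is why a careful choice of codes keeps false selections low.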

Another differentiation which should be made 
is between a fixed field code and a free field 
code. All of the codes illustrated above are 
fixed field codes in that they refer to a particular
portion or combination of positions on the storage
medium. In a free field code a particular pattern 
of codes rather than a position of codes is spec- 
ified. An example of a free field code is a code 
which is characterized by a pattern of punches as 
represented by punching the first, third, and fifth 
punch in a row on an IBM card. The search now 
is not for position 1, 3, and 5 in a specified col- 
umn of the IBM card but for this pattern found 
anywhere (or in a restricted number of columns) 
on a card. 
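
The difference between the two kinds of search can be sketched as follows. The card layout is hypothetical: each column number maps to the set of positions punched in it, and a column is taken to carry the pattern only when exactly positions 1, 3, and 5 are punched.

```python
PATTERN = frozenset({1, 3, 5})  # first, third, and fifth positions punched

# Hypothetical card: column number -> punched positions in that column.
card = {
    4: {2, 6},
    17: {1, 3, 5},
    30: {1, 3, 5, 9},
}


def fixed_field_match(card, column, pattern=PATTERN):
    """Fixed field: the pattern must appear in one specified column."""
    return card.get(column, set()) == pattern


def free_field_match(card, pattern=PATTERN, columns=None):
    """Free field: report every column, or every column of a restricted
    range, in which the pattern appears."""
    candidates = card if columns is None else columns
    return [c for c in sorted(candidates) if card.get(c, set()) == pattern]


print(fixed_field_match(card, 4))                     # False
print(free_field_match(card))                         # [17]
print(free_field_match(card, columns=range(1, 11)))   # []
```

A fixed field search must know in advance where a code was recorded; the free field search trades that knowledge for a scan of the whole card or of the permitted columns.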

The operation of a given machine will dictate
the way in which any of the above codes are entered
on the storage medium. In machines which
sort punched cards column by column, e.g. the
IBM 101, the code is most frequently entered as
standard alpha-numeric Hollerith code. Any digit
from 0 - 9 is entered as a punch in one of the
10 positions of an IBM card column (the eleventh
and twelfth positions are disregarded when numbers
are punched). A letter is punched as 2 positions
in the 12 positions of the column, one of
which is in the eleventh or twelfth position. In
most computer based systems the code is translated
into binary digits, that is, into a system
where any number is represented by units utilizing
the base of 2. Each unit represents 1 or 0, translated
into electrical impulses in an on or off position.
Any number from the more conventionally
used system with a base of 10, i.e. the decimal
system, can be translated into the binary system.
For example, 0=0, 1=1, 2=10, 3=11, 4=100, 5=101,
6=110, 7=111, 8=1,000, 9=1,001, 10=1,010,
11=1,011. Not all binary codes are translated into
the decimal system or make use of the decimal
notation. Another notation used with the binary
code is the hexadecimal notation which uses a
base of 16 instead of the base of 10. The 16 combinations
are written in ordinary language as digits
0 - 9 and letters A - F.
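
The translation between notations is a matter of repeated division by the base. A minimal sketch, with an illustrative function name, converts a decimal number to binary or hexadecimal notation:

```python
DIGITS = "0123456789ABCDEF"


def to_base(number, base):
    """Write a non-negative integer in the given base (2 to 16)."""
    if number == 0:
        return "0"
    symbols = []
    while number > 0:
        number, remainder = divmod(number, base)
        symbols.append(DIGITS[remainder])
    # Remainders come out lowest digit first, so reverse them.
    return "".join(reversed(symbols))


for n in range(12):
    print(n, to_base(n, 2), to_base(n, 16))
```

The same routine serves both notations because each hexadecimal symbol simply stands for a group of four binary digits.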
Logical Operations

The basic operation involved in literature search- 
ing is the matching of search requirements (one or 
more index headings or descriptors) against the index 
headings or descriptors in the document. In manual or 
machine coordinate indexing systems the descriptors in 
the question and in the documents are translated into 
numbers or patterns of holes and a search consists of 
the matching of these numbers or patterns. Most 
searches require the matching of more than one set of 
numbers or patterns in a specified way. The type of 
matching which can be done has already been described 
as logical product (search for descriptor A and B), log- 
ical sum (search for descriptor A or B), and logical 
difference (search for descriptor A but not descriptor B). 
Another decision has to be made whether only the combination
of descriptors is to be matched or whether the
matching has to take into consideration the stated re- 
lationship among descriptors. 
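
When each descriptor is represented by the set of document numbers posted under it, the three kinds of matching correspond directly to set intersection, union, and difference. The postings below are invented for illustration:

```python
# Hypothetical postings: descriptor -> serial numbers of documents indexed by it.
postings = {
    "A": {12, 52, 311, 1235},
    "B": {52, 62, 1235},
}

logical_product = postings["A"] & postings["B"]     # A and B
logical_sum = postings["A"] | postings["B"]         # A or B
logical_difference = postings["A"] - postings["B"]  # A but not B

print(sorted(logical_product))
print(sorted(logical_sum))
print(sorted(logical_difference))
```

This sketch matches combinations of descriptors only; a search that must respect a stated relationship among descriptors needs additional codes on each entry.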

Form of Search Results 

The results of a machine search can be the serial
numbers of documents of probable interest (printed out
or interpreted on punched cards), the bibliographic citations,
abstracts, or even full copies of documents of
probable interest, either printed out by the machine or
reproduced photographically by the machine.



C. Organization of Section 

Since the purpose of each of the installations is 
the same, namely to provide pertinent documents in 
answer to a stated or anticipated need, attempts have 
been made to include the same type of information about 
each installation and to arrange it in the same way. The 
planned arrangement of each description of an installa- 
tion is outlined: 

Machine characteristics: A brief description 
of the machine and any modification or change 
which has been made from the standard model of
the machine is included.

System environment: Information about the 
user, the collection, the indexers, and the start- 
ing date of the system is included. 

Index: The type of index which is used, the
size of the index vocabulary, the control which is
exercised over the index vocabulary, the specificity
of the index heading, and the depth of the index
are mentioned.

Code: The type of machine code used is indicated.

Use: The kind and frequency of use are reported.

Claims: Any cited advantages of this system 
over any other system, conventional or coordinate, 
are indicated. 

Evaluation: Comments about the system, ei- 
ther reported in the literature or made by the writ- 
er of the present report, are given. 

For most installations, information about one or
more of these points was not reported. This is mentioned
in many cases since the lack of information about
an important point such as the use of the system
will have a bearing on its evaluation.



Electronic Information Systems 



Section D - Photoelectric Systems 

Part 1 - Systems Using IBM-Type
Punched Cards and Sorted
by Photoelectric Methods



Samain's Electronic Selector 

Samain, in the 1940's, attempted to move away
from the limitations placed on punched cards by fixed 
field coding and to develop a system using Hollerith 
cards which would provide greater versatility. This 
system has been described by Samain in several articles 
(9,10) and by other writers in briefer descriptions (11, 12). 
Several foreign patents and one U. S. patent (13) have 
been granted to Samain. 

The system provides that each of the twelve rows 
of the cards is divided into two portions, making 24 
sections of 40 positions each. Each 40-position section 
can represent a six-letter code word. Each letter is 
coded by one or two punches in a six-position segment 
of the 40-position row. The extra spaces can be used 
to record logical or syntactical relationships (14). The 
coding system gives the card a capacity of twelve words 
of thirteen letters, or twenty-four words of six letters, 
or thirty-six words of four letters. An artificial six- 
letter vocabulary then has the capacity of 60 million 
different words (15). 
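
The stated word capacities follow from the card geometry. The sketch below assumes an 80-column card (twelve rows of two 40-position sections) and six positions per letter, and assumes that a word may not run over from one row to the next:

```python
ROWS = 12                  # rows on the card
COLUMNS = 80               # two 40-position sections per row (assumed layout)
POSITIONS_PER_LETTER = 6   # each letter occupies a six-position segment


def words_per_card(letters_per_word):
    """Number of whole words that fit if no word crosses a row boundary."""
    positions_per_word = letters_per_word * POSITIONS_PER_LETTER
    return ROWS * (COLUMNS // positions_per_word)


for letters in (13, 6, 4):
    print(letters, "letters:", words_per_card(letters), "words")
```

Under these assumptions the sketch reproduces the three capacities given above, with a few positions left over in each row or section for recording relationships.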

A special typewriter is used to record the terms 
by making perforations in the cards (16). 

A selector is provided which is:
easily adjustable to any term and which reads
the cards one by one at the speed of approx- 
imately 400 cards per minute. The reading 
will be performed with photo-electric cells by 
a special process. With successive selections, 
we can select the cards relating to any group 
of terms (17). 

Shaw, in his review of various information sys- 
tems, indicates that photo-electric cells were not used 
but that: 

In searching for information on the card, brush- 
es were used by Samain to make contacts 
through each point at which the hole had been 
punched, just as in normal electrical Hollerith 
searching, but these pulses were fed into an 
electronic memory, which stored the pulses. 
When the pulses match the combinations of 
pulses set into the memory as the subject of 
the search, that particular card is dropped in- 
to a pocket, just as is the card in normal 
Hollerith searching (18). 

He states that: 

This mechanism was an interesting experiment- 
al development, which has never been pursued 
to a conclusion as to whether it may usefully 
be applied, and the inventor has given it up in 
favor of a modification of the Rapid Selector 
(18). (See Filmorex) 

IBM Photoelectric Scanner 

The IBM photoelectric scanner, known as the
"Luhn Scanner" or "Luhn Machine," has been described
in detail by Luhn (19) and in a number of reports in
Chemical and Engineering News (20, 21, 22). The sys- 
tem used, like that of Samain, does not require fixed 
field coding and operates by photoelectric processes (18). 
In this system, the code consists of five punches in the 
twelve positions of the IBM card column. The combina- 
tions derived from the five punches in the twelve punch
field:

are grouped in a number of series which are 
assigned to sets of alphabetic, numeric and 
special characters, including lower and upper 
case, as available on standard typewriters. 
Certain series represent 2-digit numbers by 
a single combination (or pairs of letters) there- 
by cutting space requirements for such infor- 
mation in half as compared with present IBM 
card coding. Division marks for separating 
words are part of the letter code, an arrange- 
ment favoring compactness of recording (23). 

In scanning, inquiry cards are punched with a 
complementary pattern of the seven holes not punched 
in the document cards. The cards are passed length- 
wise past a photoelectric scanning station at the rate 
of 1,000 cards per minute. When the opaque portions 
of the inquiry card match the holes of the codes on the 
document card, a blackout results and the mechanism 
is activated to drop the card into a special pocket. One 
photocell scans from one to four columns and the photo- 
cells may be wired to act independently or to produce 
various logical combinations (24). 
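
The blackout principle can be modeled with sets of punched positions. In this sketch the twelve positions of a column are numbered 0 to 11 as an assumption, and the particular five-hole codes are invented for illustration:

```python
from itertools import combinations

ALL_POSITIONS = frozenset(range(12))  # twelve positions in one card column


def inquiry_card(sought_code):
    """Punch the seven positions that the sought five-hole code leaves solid."""
    return ALL_POSITIONS - frozenset(sought_code)


def blackout(document_holes, inquiry_holes):
    """Light passes only where both cards are punched; no light is a match."""
    return not (frozenset(document_holes) & inquiry_holes)


sought = {0, 3, 4, 8, 11}
inquiry = inquiry_card(sought)

print(blackout({0, 3, 4, 8, 11}, inquiry))  # True: same code, no light passes
print(blackout({0, 3, 4, 8, 10}, inquiry))  # False: position 10 is open on both

# Of all possible five-hole codes, only the sought one produces a blackout.
matches = [c for c in combinations(range(12), 5) if blackout(c, inquiry)]
print(len(matches))
```

Because every code has exactly five holes, a document code that differs from the sought code in any position must have at least one hole facing a hole in the inquiry card, so light reaches the photocell.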

A switch on the card punch enables it to punch 
either the "question" or "answer" card. A sorter and 
transcriber have also been developed as accessories to 
the system. The machines are of the same order of 
complexity as standard IBM machines, and in their 
early development, it was expected that they would not 
cost any more than standard IBM machines (24). 

The five-hole codes yielded 792 possible combina- 
tions per column, thus allowing for complete upper and 
lower case alphabets, two-digit numbers, special sym- 
bols and operations, and one hundred two-letter combin- 
ations (25). These two-letter combinations were worked 
out in a series named the "Luko" series (26). 
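
The figure of 792 is the number of ways of choosing five positions out of twelve. The short check below also shows how much of that total the listed character sets would consume; the counts of 26 letters per alphabet and 100 two-digit numbers (00 to 99) are assumptions made for the illustration.

```python
from math import comb

total_combinations = comb(12, 5)  # five punches chosen from twelve positions

# Assumed sizes of the character sets named in the text.
allocated = {
    "upper-case letters": 26,
    "lower-case letters": 26,
    "two-digit numbers": 100,
    "two-letter combinations": 100,
}

remaining = total_combinations - sum(allocated.values())
print(total_combinations)  # 792
print(remaining)           # left for special symbols and operations
```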

The prototype equipment was completed in 1950
and first publicly demonstrated in New York at the
World Chemical Conclave, September 1951. Perry then
experimented with the machines at the
Center for Scientific Aids to Learning at Massachusetts
Institute of Technology, where it was reported that they
were highly satisfactory for searching purposes, with the
speed limited only by the maximum rate at which the
cards could be handled, that is, 600 per minute (27).

Methods were worked out whereby a group of 
cards could be scanned as a unit by means of a hold- 
over device. This allowed for the simultaneous scan- 
ning of a group of cards which pertained to a single 
document. Counters were also to be added to the ma- 
chine to indicate the number of cards satisfying a cer- 
tain criterion or a combination of criteria being searched, 
so that one pass of the cards could select a generic 
category and at the same time count the number of 
cards in each sub-category (21). 

A patent was granted in 1955 to H. P. Luhn for 
the photoelectric device for scanning cards. The de- 
vice described in the patent contained two scanners, 
each capable of covering four fixed-field locations on 
the card. The patent claim indicated that desired cards 
are selected by means of a relay device operated when 
the scanners receive no light rays through the combin- 
ation of master card and specimen card (28). 

The IBM photoelectric scanner was not put into 
production by IBM because of the cost of incorporating 
the five-hole code. The new equipment which would be 
required with five-hole punching, it was felt, would not 
be in enough demand to warrant the cost, even though 
the process worked satisfactorily. It was decided to 
revert to the standard punch sorter with three punches 
and regular IBM transcriber with standard IBM punch 
codes, but to continue using the Luhn scanner, redesigned
for greater flexibility in searching for complex
relationships. It was reported that the new scanner
was to be delivered to Perry's group at the Battelle
Institute early in 1955 (29).

Shaw states that: 

This attempt to sort over the total area of the
card was not very successful because of the
mechanical problems in handling the card so
as to control the passage of light, with close
enough tolerance between the punched card and
the inquiry card to make this effective. The
machine also provided only about one-half of
one per cent of the speed of the Rapid Selector.
It is no longer in process of development
(18).

The FID Manual also describes the system and 
states that a second model was to be built "which may 
prove to be the production prototype" (30). Shaw, in 
1956, stated that the machine was an experimental mod- 
el developed by Luhn "on which development work was 
discontinued more than a year ago" (31). 

A report from the Welch Medical Library Indexing 
Project stated that a trial run had been made on the IBM 
Photoelectric Scanner, in which 140 articles were indexed 
and coded. 

The results of the trial run were highly satis- 
factory. This machine seemed to have great 
possibilities and we would have liked to have 
had more experience with it (32). 

Since machines were not available, all future work on 
this project was done on the IBM 101 (31). 

The principles of the Luhn scanner have also been 
described by Taube in introducing the principles to be 
used in his "COMAC," or "Continuous Multiple Access 
Collator," described later (33). 

The Bush Patent Office Report refers to the Luhn 
system as the IBM X-794 and states that it: 

is a special machine which utilizes a machine 
scanning code of 792 characters that is record- 
ed on standard punched cards. . . . Questions 
can be combined, up to a limit of 72 charact- 
ers, in various logical combinations such as 
any one or more, any three or more, and all 
of one or more except certain specified codes. 
Cards responding to the question are automat- 
ically selected and accumulated, and, where 
several questions have been combined, the ac- 
cumulated cards are then separately segregated 
for each question. Operating speeds, for the 
simultaneous search of the entire card, are 
expected to be about 1,000 cards per minute 
(34). 
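
The logical combinations named in the report can be illustrated with a short sketch. It is an illustration only, under stated assumptions: each card is modelled as a set of codes, and the function names `at_least` and `all_except` are hypothetical and are not taken from the report or from any IBM description of the machine.

```python
# Illustrative sketch only: each card is modelled as a set of codes.
# Function names are hypothetical, not taken from the report.

def at_least(card_codes, question_codes, k):
    """True if the card carries k or more of the question's codes."""
    return len(set(card_codes) & set(question_codes)) >= k

def all_except(card_codes, required, excluded):
    """True if the card carries every required code and no excluded code."""
    codes = set(card_codes)
    return set(required) <= codes and not (codes & set(excluded))

# Invented sample file of three cards.
cards = {
    1: {"A", "B", "C"},
    2: {"A", "D"},
    3: {"B", "C", "D", "E"},
}
question = {"A", "B", "C", "D"}

any_one = [n for n, c in cards.items() if at_least(c, question, 1)]
any_three = [n for n, c in cards.items() if at_least(c, question, 3)]
a_not_d = [n for n, c in cards.items() if all_except(c, {"A"}, {"D"})]
```

With this sample file every card answers "any one or more," cards 1 and 3 answer "any three or more," and only card 1 carries code A without code D.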

Taube points out that even at 1,000 cards per minute, 
it would require 16 1/2 hours to search a million items 
to answer one two-termed question. He states that be- 
cause of this: 

a searching system, even when so advanced a 
device as a Luhn scanner, can only be used for 
relatively small collections or for collections 
which permit the division of items into mutual- 
ly exclusive classes, each one of which is 
small enough to make searching the total class 
practical (35). 
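
Taube's figure follows from simple division. The sketch below assumes one card per item and a single pass of the whole file at the quoted rate; both are assumptions made for the illustration rather than statements from the source.

```python
# Time for a serial (linear) search at a fixed card rate.
CARDS_PER_MINUTE = 1_000      # speed quoted in the Patent Office report
ITEMS = 1_000_000             # assumption: one card per item

minutes = ITEMS / CARDS_PER_MINUTE
hours = minutes / 60
print(f"{minutes:.0f} minutes, about {hours:.1f} hours")
```

The result, roughly 16.7 hours, is of the same order as the 16 1/2 hours quoted.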

He states that: 

The great advance of the Luhn Scanner was its 
demonstration that free field coding could be 
used with punched cards and that one card 
could constitute the question which interrogated 
the store on other cards (36). 

Continuous Multiple Access Collator 

Taube has recently proposed a new machine, which 
he called the COMAC, or Continuous Multiple Access 
Collator, based on the principles of the Luhn scanner 
but which will be efficient enough for use with a collec- 
tion of a million items (37). The system to be used is 
like that of the Luhn scanner except that collation as in 
coordinate indexing is to be used instead of linear 
searching (38). 

The COMAC operates under the principle of matching 
codes on one punched card against codes on another 
punched card and punching the code for the logical 
product on a third card. This card can then be 
collated with a third subject card and the final answer 
printed rather than punched. Thus, no refiling is 
necessary. A special code known as the "Chinese 
binary," which allows numbers from 1 to 999,999 to be 
punched in two columns of the standard IBM card, is 
used. By using 74 columns, thirty-seven item numbers 
can be punched on a single subject card (39). With a 
collection of one million items and an average of 20 
descriptor terms per item, assuming 10,000 terms are 
used, the resulting 20 million item-number entries could 
be punched on 540,540 cards. The 540,540 cards, in this 
system, would be organized into 10,000 groups, one for 
each subject, averaging 54 cards to a group. Using the 
Uniterm system, a two-term question could be answered 
by comparing two 54-card groups. Cards could be added 
at any time without "dedicating space" for them in 
the group (unlike Minicards). It is stated that the 3 
million patents in the Patent Office could be handled on 
approximately 1,600,000 cards, and again assuming a 
10,000-word vocabulary, they would average 162 cards 
per group (40). The procedure in collating would be 
to advance the cards endwise two columns at a time 
(41). 
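
The card counts quoted above can be checked with a few lines of arithmetic. The function name `comac_cards` is hypothetical; the figures are those given in the text, and the sketch counts only completely filled cards, as the quoted totals appear to do.

```python
def comac_cards(items, terms_per_item, vocabulary, codes_per_card=37):
    """Return (entries, full cards, average cards per subject group)."""
    entries = items * terms_per_item        # item numbers to be punched
    cards = entries // codes_per_card       # completely filled cards only
    return entries, cards, cards / vocabulary

# One million items, 20 terms per item, 10,000-term vocabulary.
entries, cards, per_group = comac_cards(1_000_000, 20, 10_000)

# Three million patents under the same assumptions.
p_entries, p_cards, p_per_group = comac_cards(3_000_000, 20, 10_000)
```

In practice each subject group would end in a partly filled card, so an actual file would need somewhat more cards than these totals.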

It is stated: 

We have not attempted in this paper to describe 
the Comac apparatus. However, from our 
studies of existing punched card equipment, 
binary to decimal converters, comparators, 
etc., it appears that once the basic concept of 
the Comac is accepted, the construction of a 
device for single code comparison represents 
only a very modest development effort. Actually 
the character of the physical equipment necessary 
is practically deducible from the new 
concept of collation as a matching and print-out 
process of item codes rather than a card 
selection and interfiling process (42). 

It is assumed that cards can be advanced two columns 
at a time and compared in the COMAC at about 
twice the rate they can be advanced in existing card 
reproducers that feed cards the long way. With 
COMAC cards containing 37 codes, the 54 cards containing 
2,000 codes could be read in 100 to 150 seconds. 
By doubling this figure to allow for intermittent advance 
of the two groups of cards, it is estimated that 
it would take three to five minutes to search the average 
two-termed question. This is compared with 16 1/2 
hours for the Luhn scanner in a similar situation. It 
is expected that in any sizable installation there would 
be a number of COMAC machines available so that numerous 
searches could be carried on at the same time. 
Thus, if there were five COMACs available, they could 
answer one question per minute (43).* 
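
The timing estimate can be retraced as follows. The sketch uses only the figures quoted in the paragraph above; the variable names are chosen for the illustration.

```python
# Figures from the text: one 54-card group is read in 100 to 150 seconds.
read_seconds = (100, 150)

# Doubled to allow for intermittent advance of the two groups of cards.
search_seconds = tuple(2 * s for s in read_seconds)
search_minutes = tuple(s / 60 for s in search_seconds)

# Five machines working on separate questions at the same time.
machines = 5
minutes_per_answer = tuple(m / machines for m in search_minutes)

# Comparison with the 16 1/2 hours quoted for the Luhn scanner.
luhn_minutes = 16.5 * 60
speedup = tuple(luhn_minutes / m for m in search_minutes)
```

The doubled figure is 200 to 300 seconds, or about 3.3 to 5 minutes per question, and five machines would therefore complete a question every 0.7 to 1 minute on average. On these figures the COMAC would be some 200 to 300 times faster than the serial scan.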

The COMAC was conceived under a research con- 
tract with the Air Force Office of Scientific Research 
(44). 

Addendum: The machine version of the COMAC, the IBM 
9900 Special Index Analyzer (hereafter the IBM 9900), is 
described in an IBM publication (45). The machine is 
composed of 3 units: 

A modified IBM 36 Card Punch which is 
used for reading cards; 

A logical and intermediate storage unit 
which contains both the control equipment, a 
paper tape punch, and a paper tape reader for 
retaining the intermediate results of operations; 

A typewriter for automatically printing the 
results of the search (46). 

*There is no indication in this report of the type of hardware 
to be used or whether the collating is to be done by 
mechanical systems. Since it is described as a development 
of the Luhn machine, it is placed here with photoelectric 
devices rather than with strictly electronic searching 
systems. 

The basic storage unit of the system is an IBM 
card. Each card contains the following information: 
the encoded name of the descriptor, the sequence num- 
ber of a descriptor card in a descriptor deck, and the 
encoded document serial numbers. Each document ser- 
ial number is encoded on 6 columns of the card. A 
maximum of 12 document numbers can be encoded on a 
card; the remainder of the space is needed for the descriptor 
card sequence number (47). (Taube, as we have 
seen, put up to 37 document serial numbers onto one 
descriptor card (48).) 

New documents are added to the system by assign- 
ing them the next available serial number and adding 
this serial number onto the pertinent descriptor cards. 
One card is made for each descriptor which applies. 
The descriptor code and the document serial number 
are added to each punched card. An x punch is punched 
in column 1 to identify the card as a new document 
number card. The new document number cards are 
added to the regular descriptor deck by means of a log- 
ical sum operation in the IBM 9900 (49). 

In an IBM 9900 search, 2 decks of descriptor 
cards are matched for common document serial num- 
bers. The first descriptor deck has to be reproduced 
on paper tape before it can be combined with the sec- 
ond descriptor deck, which is on punched cards (50). 
Searches can be made for logical products, sums, and 
differences (51). Only 2 decks are matched at any one 
time. The results of searches--the matching document 
serial numbers--appear on paper tape. The numbers 
are then printed out on a form. The search time for 
a typical search is indicated. A logical product search 
with 3 descriptors which include 300, 240, and 360 doc- 
uments respectively requires about 10 minutes machine 
time (52). If we assume that the average descriptor in 
this collection includes 300 documents and if we use an 
average of 20 descriptors per document and a descriptor 
vocabulary of 10,000 descriptors--Taube's figures 
(53)--then we can calculate the approximate size of 
such a file. 

$$\text{Average number of documents per descriptor} = \frac{\text{Total number of documents} \times \text{Average number of descriptors per document}}{\text{Total number of descriptors}}$$

or 

$$\text{Total number of documents} = \frac{\text{Total number of descriptors} \times \text{Average number of documents per descriptor}}{\text{Average number of descriptors per document}}$$

$$\text{Total number of documents} = \frac{10{,}000 \times 300}{20} = 150{,}000 \text{ documents}$$

A 3 descriptor logical product search for a file of 
about 150,000 documents takes about 10 minutes of ma- 
chine time, as compared to Taube's figure of 3 to 5 
minutes for a 2-term search of a file of 1,000,000 documents 
(54). 
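
The logical product, sum, and difference searches described above amount to merging two ascending lists of document serial numbers. The following is a minimal sketch under that assumption; the function name `collate` and the sample decks are invented for the illustration and do not reproduce the internal logic of the IBM 9900. The intermediate list plays the part of the paper tape on which the first result is held before the next deck is matched.

```python
def collate(a, b, op):
    """Merge two ascending lists of document serial numbers.

    op is "product" (numbers in both lists), "sum" (numbers in either)
    or "difference" (numbers in a but not in b).
    Each list is assumed to contain no duplicates.
    """
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            if op in ("product", "sum"):
                out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            if op in ("sum", "difference"):
                out.append(a[i])
            i += 1
        else:
            if op == "sum":
                out.append(b[j])
            j += 1
    # Whatever remains in one list has no partner in the other.
    if op in ("sum", "difference"):
        out.extend(a[i:])
    if op == "sum":
        out.extend(b[j:])
    return out

# Invented sample decks, one per descriptor.
deck_a = [3, 8, 15, 21, 40]
deck_b = [8, 9, 21, 33]
deck_c = [1, 21, 40]

# A three-descriptor product is done two decks at a time.
intermediate = collate(deck_a, deck_b, "product")
answer = collate(intermediate, deck_c, "product")
```

Because only two decks are matched at any one time, a search on three descriptors requires two passes, which is consistent with the longer machine time reported for the three-descriptor search.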

Use: The IBM 9900 was demonstrated at the In- 
ternational Conference on Scientific Information in Wash- 
ington, D. C., November, 1958. No reports of the 
use of the machine are as yet available. 



Part 2. The Rapid Selector 
Early Development 

According to Shaw (55), the first practical applica- 
tion of electronics to selection of data on film was 
probably that of Dr. E. Goldberg of Germany as re- 
vealed in the U.S. patent granted 29 December 1931. 
Goldberg's patent was applied for on 5 April 1928 (56). 
It claims: 

A process of carrying out adding, sorting, sta- 
tistical and like operations which consist in ex- 
ploring indications upon a search element com- 
prising a search plate and a record element 
comprising a record card or strip and causing 
the radiating energy to actuate a recorder 
when the explored indications upon the search 
plate and record element are identical, the in- 
dications of one of said elements being penetra- 
ble by the rays and the indications on the oth- 
er element being impenetrable by the rays (56). 

Dr. Vannevar Bush of the Massachusetts Institute 
of Technology is credited with developing the basic 
principles of organization of knowledge applied in the 
Rapid Selector and the basic electronic system involved 
(55, 57, 58, 59). An experimental machine known as the 
"Bush Rapid Selector" was worked on at the Massachusetts 
Institute of Technology and was announced there in 
1940 (60,61). 

The Bush Rapid Selector: 

made selections of particular bits of data from 
a checkerboard of light and dark squares on 
each frame of film that formed a code. Each 
square had to have a photoelectric cell to monitor 
it. When a particular subject was wanted 
the machine was set for a certain pattern and 
when all cells in that pattern received light 
impulses, there was the desired item. 

The drawback to this system was the num- 
ber of photoelectric cells required (57). 

The machine was designed for abstracts of much greater 
brevity than used in the later Rapid Selector. "The 
file speed was low enough with respect to the rate of 
advance of the recopying camera to obviate the need 
for any slowdown on the film drive in the case of closely 
spaced 'hits'" (62). According to Shaw (63), the principle 
used "was not operable and the selector failed to 
work." 

The machine was dismantled when in World War 
II it was necessary to salvage the electronic parts used 
in the machine (57). 

After the war a number of interested persons 
again took up the problem of the Rapid Select- 
or. Among them were Ralph Shaw, Librarian 
of the Department of Agriculture, and a group 
of persons who had originally worked with Dr. 
Bush, and were now independently organized as 
Engineering Research Associates, with head- 
quarters in St. Paul (57). 

The need for finding more efficient methods of 
handling reports was of particular concern to the Office 
of Technical Services and the Department of Commerce. 
The problems were discussed with Dr. Vannevar Bush, 
and Ralph Shaw and the Engineering Research Associates 
proposed a development contract on the Rapid Selector. 
The Department of Commerce approved this contract 
under the Office of Technical Services in 1947 and 
$75,000 was allocated for the work. Engineering Research 
Associates were to produce the prototype machine 
under the direction of Ralph R. Shaw, who was also 
to prepare information for and to test the machine (10). 

Operation 

The fundamental principle of operation of the Rap- 
id Selector is the application of a photocell to deter- 
mine when a desired code, made up of white and black 
spots, matches codes identifying abstracts of articles 
filmed on 35 mm. motion picture film (65, 66). The 
Rapid Selector used 2,000-foot rolls of 35 mm. film 
containing approximately 100,000 pages of text or abstract, 
and space for 600,000 coded index entries (63). 

The information is stored on one-half of a 
standard 35 mm. frame and the binary coding 
representing its subjects is stored in the form 
of opaque dots photographically produced on 
the other half of the 35 mm. frame. Holes 
representing the subject looked for are punched 
into an opaque mask, and the film is run over 
this mask. Light passing through the film and 
the mask falls on a photocell and as long as 
light reaches the photocell nothing happens. 
When the black dots on the film match all the 
holes in the interrogator mask, no light reach- 
es the photocell for a tiny fraction of a second, 
this permits the stroboscopic camera to oper- 
ate and a projection print of the frame of text 
associated with the code dots is made while 
the film keeps moving along and continues its 
search operation. The stroboscopic camera 
developed for the rapid selector makes its 
picture in two-millionths of a second so that 
there is no need to stop or slow down the film. 
Thus searching is done at the rate of 120,000 
choices per minute and copies of pertinent pag- 
es are made as the searching is done. The 
machine was designed to use available auxiliary 
equipment; the width of the image on the take-off 
film is 0.9 in. so that the film taken from 
the selector may either be read in a micro- 
film enlarger or run through an automatic 35 
mm. V-mail type enlarger and converted into 
full-size paper copies at the rate of about 500 
full-size enlargements in seven minutes (63). 

When frames which are to be photographed are 
too close together, the mechanism which moves the 
next frame into position for exposure cannot move 
quickly enough to have it ready in time to photograph. 
This problem was solved using an anticipatory head or 
photoelectric scanner which anticipates the approach of 
a frame which is too close to be photographed at high 
speed. This device slows down the speed of operation 
from 300 feet per minute to 50 feet per minute to 
allow photography of the second "hit." After the second 
picture is taken, the machine resumes its normal 
speed (67, 68, 69). 
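
The photocell rule in the quoted description, by which a frame is selected when no light passes through the film and the mask together, can be sketched as a bit comparison. The sketch is an illustration only: code positions are modelled as bits of an integer, and the function name `is_hit` and the sample frames are invented.

```python
def is_hit(frame_dots, mask_holes):
    """A hit occurs when no light gets through: every hole is covered by a dot."""
    return (mask_holes & ~frame_dots) == 0

# Positions are bits in an integer.
# A 1 bit is an opaque dot (on the film) or a punched hole (in the mask).
mask = 0b0101_0010
frames = [
    0b0101_0010,   # dots exactly behind the holes
    0b0101_0000,   # one hole left uncovered, so light reaches the photocell
    0b1101_0110,   # all holes covered, with extra dots elsewhere
    0b0000_0000,   # no dots at all
]
hits = [i for i, dots in enumerate(frames) if is_hit(dots, mask)]
```

Under this simple rule the third frame also registers as a hit, since additional dots outside the hole positions do not admit any light. Whether and how the actual machine guarded against such responses is not stated in the passage quoted.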

Descriptions of the recording camera and auxiliar- 
ies, optical systems, electrical control, and mechanical 
design can be found in the Engineering Research Associates' 
"Report for the Microfilm Selector" (70), and in 
an article published in Electronics (71). 

After the original Rapid Selector was developed, 
a high speed intermittent camera was developed, fol- 
lowing a suggestion of Vannevar Bush. The mechanism 
was developed by Ralph Shaw and a group from the 
National Bureau of Standards headed by Jack Rabinow, 
through funds provided by the Atomic Energy Commis- 
sion (72). 

Film Preparation 

In preparing the file: 

When the operator photographs the abstract he 
simultaneously enters, on a keyboard device, 
as many as six separate catalog descriptions 
by using the code numbers. These appear on 
checkerboard form on the film right beside the 
abstract. Photographing is a routine, rapid 
job, and can be done without professionally- 
trained help (68). 

The coding of an abstract is in the form of 
seven-digit numbers. The abstract originally 
entered on the microfilm can be described and 
classified by a maximum of six such seven-digit 
numbers. Any abstract may then be selected 
by interrogating the device with any one 
of six characterizing seven-digit code numbers 
as an index (73). 

In photographing materials for the machine, the 
operator: 

has before him a book in which all the subject 
categories are listed. Beside each category a 
number has been placed. Because each refer- 
ence has room for seven digits, up to ten mil- 
lion number combinations are possible. Re- 
arrangement of film could, of course, give 
more possibilities. Therefore, several refer- 
ence systems can be adopted "bodily" by the 
machine, and used side by side; the laborious 
task of recataloging so as to adopt a single set 
of subject headings to a collection is thus 
avoided (68). 

Evaluations 

The performance characteristics reported (74) for 
the Rapid Selector were as follows: 

Speed, ft./min.:                                  300 
Speed, abstracts/sec.:                            180 
Speed, 7-digit numbers/sec.:                    1,100 
Reel capacity, feet:                            2,000 
Reel capacity, abstracts:                      72,000 
Running time, min./reel:                          6.7 
Recopying camera, frames/sec.:                     30 
Minimum reel slowdown time, sec.:                0.12 
Film passed after a slowdown command, inches:       4 
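
The figures in this table are mutually consistent, as the following check suggests. It assumes that every abstract carries the maximum of six seven-digit code numbers; that assumption, like the variable names, is introduced only for the illustration.

```python
speed_ft_per_min = 300
abstracts_per_sec = 180
reel_feet = 2_000

inches_per_sec = speed_ft_per_min * 12 / 60            # film speed in in./sec.
inches_per_abstract = inches_per_sec / abstracts_per_sec
reel_abstracts = reel_feet * 12 / inches_per_abstract  # abstracts on one reel
running_minutes = reel_feet / speed_ft_per_min         # time to run one reel

# Assumption: six 7-digit code numbers accompany every abstract.
numbers_per_sec = abstracts_per_sec * 6
```

The film moves at 60 inches per second, each abstract occupies one-third of an inch, a reel holds 72,000 abstracts, and a reel runs in about 6.7 minutes. The computed 1,080 code numbers per second agrees with the tabulated 1,100 when rounded.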

The Patent Office Report, in comparing the Rapid 
Selector with other devices, stated that it could search 
in four to six minutes material which would take an 
ordinary card sorter 900 minutes and an IBM 101 type 
sorter about 72 minutes. 

It was believed that the machine would be capable 
of an operating speed of 500 feet per minute if (1) larger 
motors were used to drive the reels, or (2) the film length 
between the anticipatory projectors and the main projector 
were doubled (76). It was believed that the system 
could be used for filing and indexing letter files, 
literature of the type included in a publication such as 
Chemical Abstracts, and for improvement in the index 
in such services, and in searching of Patent Office 
files (76). The inventor stated that: 

while useful results may be achieved merely 
by using the machine to do more speedily and 
more efficiently what we can now do . . . a really 
important contribution to the advancement of 
science will result only if we can re-think the 
methods of organization of knowledge to take 
full advantage of the new technique (77). 

Shaw believed that with high reduction ratios and 
inclusion of bibliographical material only, instead of 
abstracts, the Library of Congress catalog could be 
stored on film, representing only 2 5 cubic feet of stor- 
age space and about forty minutes of running time if 
the entire catalog needed to be run. He believed that 
if the reels were mounted in cartridges of five second 
runs, it should be possible for one machine to answer 
at least twelve questions per minute or 8,000 per day 
and that "one machine might then, if properly used, 
handle all the reference uses of the public catalog of a 
research library" (78). 

Coding and Selector Design 

While the Rapid Selector was under development, 
there was considerable discussion of coding for the ma- 
chine. Wise and Perry (79) suggested adapting coding 
system devices for key sort cards to coding for the 
Rapid Selector. This system would employ code desig- 
nations of six letters each instead of the seven digit 
ones used, thereby increasing the number of available 
code designations from 10 million to more than 300 mil- 
lion. It was proposed that these codes be superimposed 
on the Rapid Selector coding area, permitting as many 
as 16 contexts to be coded on a microfilm frame rather 
than the limitation of six. It was believed that this 
would increase the ability to conduct searches for combinations 
of several concepts at one time instead of 
permitting only single-concept searches. It was believed 
that this coding system would require relatively minor 
changes in the design of the Rapid Selector. 
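
The two figures for the number of available code designations follow directly from the size of the symbol set and the length of the code. The calculation below assumes an alphabet of 26 letters.

```python
digit_codes = 10 ** 7     # seven decimal digits
letter_codes = 26 ** 6    # six letters, assuming a 26-letter alphabet

print(f"{digit_codes:,} seven-digit codes")
print(f"{letter_codes:,} six-letter codes")
```

Six letters give 308,915,776 designations, which is the "more than 300 million" cited, against 10 million for seven digits.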

Mooers (80) criticized this suggestion on the 
grounds that it was inefficient in its utilization of avail- 
able code space, its high rate of false drops and its 
inflexibility in requiring a constant number of code 
marks for each concept. He believed that random cod- 
ing would equalize code mark frequencies and would be 
far more satisfactory. Wise (81), in a reply to 
Mooers, defended multiple code word coding and claimed 
that word coding is flexible, does not require a code 
dictionary, can show relationships between ideas, and 
can be made random by utilizing symbols or a modified 
alphabet. 
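
The idea of superimposing codes on a single coding field, and the false drops that can result, may be illustrated in general terms. The sketch below is not a reconstruction of either Mooers' or Wise and Perry's scheme, neither of which is specified in detail here. The field size, the number of marks per concept, the use of a hash function to choose mark positions, and all names are assumptions made for the illustration.

```python
import hashlib

FIELD_BITS = 40   # size of the coding field (arbitrary choice for the sketch)
MARKS = 4         # marks per concept (arbitrary choice for the sketch)

def concept_code(concept):
    """Pick MARKS pseudo-random positions for a concept, as a bit pattern."""
    pattern, counter = 0, 0
    while bin(pattern).count("1") < MARKS:
        digest = hashlib.sha256(f"{concept}:{counter}".encode()).digest()
        pattern |= 1 << (digest[0] % FIELD_BITS)
        counter += 1
    return pattern

def frame_code(concepts):
    """Superimpose (OR together) the codes of all concepts on one frame."""
    field = 0
    for concept in concepts:
        field |= concept_code(concept)
    return field

def responds(field, concept):
    """A frame responds when every mark of the query concept is present."""
    code = concept_code(concept)
    return (field & code) == code

frame = frame_code(["corrosion", "aluminum", "welding"])
indexed_found = all(
    responds(frame, c) for c in ["corrosion", "aluminum", "welding"]
)
```

A frame always responds to the concepts actually coded on it. It may also respond to a concept that was never coded, if the marks of that concept happen to be covered by the marks of others; such a response is a false drop, and its likelihood grows as more concepts are superimposed on the same field.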

Proposed Modifications 

It was estimated by Engineering Research Asso- 
ciates that the Selector could be duplicated for about 
$50,000. It was also reported that the company was 
considering the development of a "junior model" applicable 
to collections of 5,000 to 10,000 items. This 
model would have simpler coding and scanning and 
would not photograph the microfilm selected but would 
stop the scanning operation so that the selected frames 
of the microfilm could be viewed by the operator (82). 

According to the Patent Office Report, Rapid Se- 
lector improvements are being designed at Yale Univ- 
ersity, using the same principle of 35 mm. film with 
both code and text, but: 

the equipment is designed to be complementary 
to standard punched card equipment. The code 
area permits simultaneous search of up to 400 
columns, or the equivalent of 5 punched cards. 
Recording of the code information is achieved 
through transformation of the pattern of holes 
on a card to a pattern of lighted miniature 
bulbs that photograph as black dots. The interrogation 
system uses a system of plug-in 
phototubes which provide greater latitude in 
the selection of coded data by searching any 
pattern involving any columns or by searching 
a range of codes within a column, if desired. 

The microfilm selector equipment, then, 
has definite advantages for searches where the 
desired result is the facsimile reproduction of 
text, drawings and other graphic material, or 
abstracts. Its chief disadvantages are that the 
medium requires a serial selection, that re- 
production is a separate process of develop- 
ment, that the area for coded selection is lim- 
ited by the space required for the text mater- 
ial, that the coded entries cannot be changed 
or added to without recopying or splicing the 
entire roll, and that the roll imposes a fixed 
physical grouping of the material which must 
be determined at the time of recording (75). 

The same report states that: 

In the Yale version of the Rapid Selector, a 
magnetic recording tract strips of the film 
has been suggested to carry pulses produced 
by recognition signals from the Interrogator 
Unit in order to activate the recopy camera. 
These pulses would be retained on the film 
after the interrogation run. This technique 
would make possible the transfer of the film 
to a separate printer so that rearrangements 
of selected frames could be made up into sep- 
arate reels as desired. It is certainly possible 
that the portion of film now reserved for op- 
tically sensed recording of coded index entries 
could be replaced by an area for magnetic re- 
cording of similar selection information that 
could be more readily revised (83). 

In 1956 Shaw discussed the Rapid Selector and 
made proposals for its improvement. He pointed out 
that in spite of the fact that the Rapid Selector used 
input rolls of 2,000 feet of film containing approximately 
100 thousand pages of text and 600 thousand index 
entries in a quarter of a cubic foot of space, the input 
unit required four minutes of running time plus another 
minute or two of reel change time so that only ten 
reels could be run per hour and that if even only one 
reel could be used per inquiry, only 48 questions could 
be answered per day despite the high speed operation 
of the electronic parts of the Selector. He pointed out 
that the lack of balance in input, internal running speed, 
and output of the machine invalidates the value of the 
speed of the electronic selector. He believed that, 
"the basic error was the assumption that we could run 
fast enough to avoid preclassification" (84). He suggested 
the use of preclassification, using 50-foot cartridges 
instead of 2,000-foot rolls, suggesting that the 
search time could then be changed from 6-minute units 
to half-minute or one-minute units. He stated that, 
"This requires additional development work but the 
principle has been established and there appear to be 
no special difficulties about this development" (85). 

Shaw stated that modifications have been made in 
the Selector which have eliminated about two-thirds of 
the electronic equipment and half the optics required in 
the first model and that these have provided for comple- 
mentary as well as direct coding. He stated that: 

the product obtained from the rapid selector 
was adequate for use even in its first stage, 
and the cost of the machine appears to be one 
that can be brought within the range of any of 
the research libraries that have need for any 
mechanical devices or systems. About seven- 
ty thousand dollars were spent on the first 
prototype; about fifteen thousand dollars addi- 
tional were spent in simplifying the mechanism 
and eliminating unnecessary elements under 
new theories developed from operation of the 
machine. In its present form the machine 
could probably be duplicated for about $10,000 
to $12,000, but enough additional knowledge 
has been accumulated in the course of experimentation 
with the machine in operation that 
it appears probable that another $40,000 to 
$50,000 invested in development work should 
yield a production model of the machine that 
could be duplicated for around $20,000, and 
thus would not only be much faster and would 
supply full cycle response to questions but 
would be cheaper than any of the other ma- 
chines available or in process of development. 
The development of a new rapid selector should 
be based on high reduction ratio microfilming, 
at least at the 60-diameter level, so as to 
carry the full article in the single frame opposite 
each set of code dots. Probably slit 
photography should be substituted for flash 
photography because that would reduce the a- 
mount of electronic equipment required and 
would increase the reliability of the machine 
still further, while reducing the number of el- 
ements that might require maintenance work 
(85). 

According to a recent report, work is currently 
being done on the Rapid Selector at the Bureau of 
Standards. 

As part of the investigation of various infor- 
mation retrieval methods and machines, the 
Division is attempting to evaluate the practi- 
cability of the Bush-type Rapid Selector. Some 
of the opto-mechanical parts of the third ma- 
chine were obtained from Yale University as a 
basis for further work. To overcome some of 
the difficulties found in the earlier machines, 
certain recent advances in electronic computer 
technology and in information handling are being 
incorporated in the equipment. The laboratory 
has built a small interrogator and comparator 
for use with the modified Yale film transport 
and copying equipment. The system will use 
binary code words of 40 bits each and as many 
words as are required, the limit being set by the 
amount of electronics desired. At the present 
speed of operation, the equivalent of 240 
punched cards' worth of code information can 
be searched and copied per second. A test 
film has been made containing about 1,000 
articles from the NBS Technical News Bulletin. 
The machine master film has been copied on 
mylar base material which has far greater 
strength than the acetate base materials. Some 
test loops have passed through the machine 
many thousands of times at 60"/sec. without 
breaking. The individual parts of the system 
have been checked out; debugging the entire 
system for reliable operation is in process 
(86). 

Shaw had pointed out early in the development of 
the Rapid Selector that, "while substantially unlimited 
in storage capacity, in speed of reproduction and in 
range of selectivity, (it) is limited as to the number of 
different transactions it can carry out in a given time 
from one set of instructions" (87). 

Bedford, in reviewing machine systems for hand- 
ling information, in 1956 stated that: 

Microfilm is high in storage capacity, but the 
convenience of the unit record and ease of 
manipulation is sacrificed. The retrieval pro- 
cess is not complete until the second film is 
developed and the enlarged print is secured. 
The Selector is undoubtedly an excellent me- 
chanical system for fairly static historical col- 
lections, but it is not feasible for rapid access 
to a rapidly growing diverse collection subject 
to shifts in requirements (88). 

Vickery, in commenting on the Rapid Selector, stressed 
that high-speed selection devices perform only a por- 
tion of the total process of information retrieval, that 
of scanning symbols. He believed that when the other 
factors are considered, the reductions of storage space 
and searching time achieved by the Rapid Selector are 
bound to be considerably less than is claimed, and that 
significant advantages of speed occur only for searches 
yielding hundreds of references. He opposed the idea 
that the great searching speed in the machine solves in 
any way the intellectual problems of indexing, classifying 
and coding (89). 

Patents 

The patents which have been granted for the Rap- 
id Selector include, "System and Apparatus for Select- 
ive Photographing" (Ralph R. Shaw) (90) which covers 
various aspects of the Rapid Selector apparatus and op- 
eration; "Means for Eliminating Interference Between 
the Optical Trains of a Photographic Reproducing Ap- 
paratus" (Lawrence R. Steinhardt) (91) which describes 
a device which would prevent fogging of the copy film 
in the Rapid Selector by having the light coming through 
the code area of the master microfilm be of a different 
wave length than that shining through the text area, mak- 
ing possible better shielding of the copy film by use of 
light wave length filtering; and "Photographic Apparatus" 
(Ralph R. Shaw) (92) which describes the Rapid Select- 
or camera used for photographing various page sizes on- 
to the text portion of the microfilm while simultaneous- 
ly photographing the code pattern of a fixed size onto 
the code portion of the microfilm. 

A number of technical reports, in addition to the 
one quoted above (58), have been issued by Engineering 
Research Associates (93,94,95). Additional references 
on the Rapid Selector are listed in the bibliography (96- 
103). The Rapid Selector has also been described brief- 
ly in a large number of more general publications. 



Part 3. Filmorex 

The Filmorex System has been developed by 
Jacques Samain in Paris and has been described in a 
number of pamphlets (104, 105, 106) and articles (107-111). 
The system uses rectangular pieces of microfilm 72 x 45 
mm. which are divided into two sections, one section 
containing the code which is searched using photoelec- 
tric cells and patterns to match the codes desired, and 
the second section containing the document or an ab- 
stract thereof. 

The Film and its Preparation 

The Filmorex card is a heavy weight acetate 
sheet coated with photographic emulsion, 45 x 72 mm. 
(112). (The current card size is described as 70 x 45 
mm. in at least one reference (113), and 60 x 35 mm. in 
two references (114, 115)). The Filmorex cards are pre- 
pared from a continuous roll of film thirty meters long 
(113). A double lens camera with a special keyboard is
used to photograph the two sections of the film. After 
development, the film is cut into the card size. If ad- 
ditional copies are wanted, they can be reproduced 
from the film prior to cutting (116). 

Coding 

The coding area has twenty columns, each column
capable of recording a five-digit number. Each
digit is recorded as a pattern of two opaque dots and 
three transparent squares (117). In another description, 
the coding area of the card is divided into 25 parallel 
columns, each divided into six groups of six positions, 
making it possible to record a six-figure number on 
each line and 25 six-figure numbers in the entire cod- 
ing zone (118). 
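
The first scheme described above, two opaque dots among five squares, yields exactly ten distinct patterns, one for each decimal digit, and a misreading that gains or loses a dot can be detected. The following Python sketch illustrates the arithmetic only. The assignment of patterns to digits is an assumption made here for illustration, since the references cited do not give the actual Filmorex assignment, and the function names are hypothetical.

```python
from itertools import combinations

POSITIONS = 5   # squares available for each digit
OPAQUE = 2      # opaque dots recorded for each digit

# Every way of placing 2 opaque dots in 5 squares (1 = opaque, 0 = transparent).
PATTERNS = [
    tuple(1 if i in chosen else 0 for i in range(POSITIONS))
    for chosen in combinations(range(POSITIONS), OPAQUE)
]

# Assumption: digit d is given the d-th pattern in enumeration order.
ENCODE = {digit: pattern for digit, pattern in enumerate(PATTERNS)}
DECODE = {pattern: digit for digit, pattern in ENCODE.items()}


def encode_number(number, width=5):
    """Return one 5-square pattern per digit of a zero-padded number."""
    return [ENCODE[int(ch)] for ch in str(number).zfill(width)]


def decode_patterns(patterns):
    """Decode patterns to an integer, rejecting any without exactly 2 dots."""
    digits = []
    for pattern in patterns:
        if sum(pattern) != OPAQUE:
            raise ValueError(f"misread pattern: {pattern}")
        digits.append(str(DECODE[tuple(pattern)]))
    return int("".join(digits))
```

Under these assumptions a five-digit number occupies 25 squares of one column, of which exactly ten are opaque.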

Selection

Selection is done by passing the cards through a 
photoelectric scanning device which examines each line 
of the coded area consecutively. Whenever the desired 
codes are encountered, the scanning device operates 
an apparatus which sends the card into a special pock- 
et. Those cards which do not possess the desired 
codes are sent into a second pocket (119). The selector 
reads 600 cards per minute (4,000 digits per second)
(120). One reference indicated 700 cards per minute 
(119), and Shaw states that the running time is 400 cards 
per minute, theoretical speed (112). Samain states that
the difficulties lie in the smallness of the code squares,
which are approximately one square millimeter, in the
speed of passage, which is approximately 1,000 impulses
per second, and in the irregularities of the film.
Reliability is achieved by having five reading stations,
each with a photocell, an amplifier and a thyratron
controlling a relay (121). The system will
work with the microfilm cards in random order (119) or 
the selector itself can be used to sort the cards in any 
desired order by successive passes. Duplication of 
cards into classified groups will speed the selection 
process by making it necessary to pass fewer cards 
through the selector (122).

Use of Filmorex Cards 

The document or abstract can be read directly 
with a standard microfilm reader or it can be repro- 
duced or enlarged by standard processing methods (119). 
Since the cards are not reproduced automatically, the
original cards have to be taken from the file for use,
so that the file cannot be searched in full until the
cards are refiled (112).

Use of the System 

It is stated that: 

As of the present time, all biological papers 
abstracted in the "Bulletin Analytique" have
been encoded (123). 

In the same paper it is stated that the Bulletin
Signaletique, which publishes about 130,000 abstracts a year
in the fields of mathematics, physics, chemistry, and 
biology, is using the system. It is stated that: 

Instead of preparing an index, like Chemical 
Abstracts, we decided to initiate in 1954 a 
bibliographic research device which provides 
information with regard to questions posed by 
searchers (124). 

Coblans states that Samain has been experiment- 
ing since 1954 at the Centre National de la Recherche 
Scientifique (CNRS), Paris, with the Filmorex system. 

For the purposes of the CNRS each entry in 
the Bulletin Signaletique could be transferred 
to a micro-sheet with coding of all of its subject
aspects, author, periodical, etc., and the reference
number of the abstract in the bulletin. In this way a
list of all reference numbers of abstracts in the
bulletin on a requested
subject for a specified period of time could be 
supplied as part of its service to subscribers 
at a nominal charge (125). 

Coblans states that the mechanical aspects of recording 
and selecting can be done but that the major problem is 
classification, "the elaboration of a system viable over
the whole gamut of knowledge and enabling selection of 
all abstracts relating to a definite subject" (125). 

Filmorex Claims 

It is claimed that the advantages of the Filmorex 
system are 1. the cards are very strong and can pass 
through the selector many thousands of times, 2. the 
cards are small but hold much information, 3. the se- 
lection process is simple and almost fully automatic, 
4. the selection speed is very high, and 5. the cost of 
the equipment is relatively low (126). It is stated that 
a decimal classification system is used but that the cod- 
ing system is not rigid (123). It is stated that the se- 
lector can search for various logical combinations of 
five ideas at one time (121).



Part 4. Minicards 

The Minicard system for documentary records 
and control was first described at a meeting of the 
American Documentation Institute in November 1954, 
and in published form in January 1955 (127). A later re- 
port described the equipment used in detail including 
photographs, tables of operating rates and flow charts 
of operation, and described the filing and searching op- 
erations of a hypothetical collection of one million Min- 
icards (128). The developers of the system believe that 
it will have broad application and that it will be useful 
for computer and business applications as well as for 
handling all types of documentary information (129). 

The Minicard 

Minicards are small pieces of photographic film, 
16 x 32 mm. in size. Near one end of the piece of 
film, a slot is provided which makes it possible to 
handle the Minicards on metal sticks (130). 

The space of the Minicard is divided up between 
code areas and image areas. The image areas, of 
which there can be a maximum of twelve, each record 
the equivalent of a legal size page. When twelve im- 
age areas are used, there is some space for coded in- 
formation but as more space is used for coding, less 
is available for text (130). The reproduction in the im- 
age areas of the Minicard is done at a reduction ratio 
of 60 to 1. Thus, the Minicard has a much higher 
storage capacity than microfilm (131). In the earlier re- 
port, the maximum digital information capacity of the 
Minicard when it carries no graphic images is given as 
70 columns of 42 bits each, or 2,940 bits in all (130). In
the later report, the code capacity is specified as 66 
columns of 43 binary digits each, which are used to 
make up 7 six-bit characters plus one "parity checking" 
bit. One of the 7 characters in each column may be
used as a "tag" to signal the kind of code of the other
6 characters in the column. A boundary signal may be 
entered in the open field of the card to indicate linkages 
or relationships between codes (132). Duplicate Minicards 
are made for each significant code entry in order to re- 
duce searching time (133). 
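
The column layout given in the later report (7 six-bit characters plus one parity bit, 43 bits in all) can be illustrated with a short Python sketch. The bit order within a character and the choice of odd parity are assumptions made here for illustration; the reports cited do not specify them, and the function names are hypothetical.

```python
CHARS_PER_COLUMN = 7
BITS_PER_CHAR = 6
COLUMN_BITS = CHARS_PER_COLUMN * BITS_PER_CHAR + 1   # 42 data bits + 1 parity bit

# Code capacities quoted in the two reports, in bits.
EARLIER_CAPACITY = 70 * 42   # earlier report: 70 columns of 42 bits
LATER_CAPACITY = 66 * 43     # later report: 66 columns of 43 bits


def pack_column(char_codes):
    """Pack 7 six-bit values into a list of 43 bits ending with a parity bit."""
    if len(char_codes) != CHARS_PER_COLUMN:
        raise ValueError("a column holds exactly 7 characters")
    bits = []
    for code in char_codes:
        if not 0 <= code < 2 ** BITS_PER_CHAR:
            raise ValueError("each character must fit in 6 bits")
        # Assumption: most significant bit of each character comes first.
        bits.extend((code >> shift) & 1
                    for shift in range(BITS_PER_CHAR - 1, -1, -1))
    # Assumption: odd parity, so the column always holds an odd number of 1 bits.
    bits.append(1 - sum(bits) % 2)
    return bits


def column_is_valid(bits):
    """True if the column has 43 bits and satisfies the assumed odd parity."""
    return len(bits) == COLUMN_BITS and sum(bits) % 2 == 1
```

With a parity bit of this kind, any single bit misread in a column changes the count of 1 bits from odd to even and is therefore detectable.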

Minicards are handled on sticks which are used 
in presenting the Minicards to the machine and for en- 
tering and removing the cards from the files. The 
sticks have a capacity of 2,000 Minicards (130). The 
sticks of Minicards are combined into file units con- 
sisting of either 10 or 100 magazine units (134). 

It is claimed that Minicards have roughly the 
same cost, card per card, as punched cards, but on 
the basis of digits or bits per card, "Minicards have 
a cost advantage over punched cards which amounts to 
a factor of many times n (135). 

Preparation of the Minicard 

In making Minicards, the camera performs two 
basic functions: 1. the exposure of code patterns and 
2. the exposure of document images. The film in the 
camera is in roll form which is cut into separate Mini- 
cards after processing. The code, which is in the 
form of alpha-numeric characters, is punched onto pa- 
per tape. The paper tape is used to control the cam- 
era which then exposes the code pattern automatically. 
It is also possible to enter the code directly from a 
keyboard to the camera or to enter it from punched 
cards or magnetic tape. After the codes have been 
exposed, the document pages are exposed to the film 
(136). The camera can record forty to fifty 6-page doc- 
uments and their codes per hour (137). This operation 
requires an operator who positions the documents. Oth- 
erwise the operation is largely automatic. Both line 
and continuous tone copy can be reproduced. The film 
is processed in roll form and then passed through a 
"film chopper" which cuts the film into individual Mini- 
cards, and stacks them on the Minicard stick. In dup- 
licating Minicards, a contact printing procedure is used 
with a roll of film. Additional coding can be entered
during the duplication if desired (136). The negative 
Minicards are used to produce positive cards with code 
designations as required. The negative cards are put 
into a master file, which may be arranged by accession
number, and the positive cards go into the working
file used for selection of material (138). 

Minicard Sorting and Selection 

The Minicard system contains equipment for sort- 
ing the Minicards and distributing them to their correct 
file locations. For normal sorting, the cards to be 
filed: 

are fed one at a time past a reading station 
which sorts on one digit at a time in a desig- 
nated code column, the cards being directed to 
one of ten receiving magazines. Sorting can 
be done on both numeric and alphabetic char- 
acters (139). 

A "fine sorter" has two sets of ten receiving magazines,
each with a reading station preceding it, arranged in a
closed circle. The Minicards are transported around the
circle so that it is possible "to program this machine
to sort successively to any number
of digits without attention by an operator" (140). In a
"locked sorter" the reading stations and magazines 
function in the same manner but are moved by a linear 
transport mechanism. With this sorter, file magazines 
can be attached to it and sorting is done directly into 
the file. In the earlier report it is stated that sorting 
and selecting can be done at the rate of 1,800 cards
per minute (130). The later report lists the scanning
rate for the filing sorter as 1,000 cards per minute
and the scanning rate for selection as 1,200 cards per
minute (137). 
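
The digit-by-digit sorting into ten receiving magazines quoted above corresponds to what is now called a radix sort. A minimal Python sketch follows. It assumes, as with conventional punched-card sorters, that passes run from the last digit to the first and that the ten magazines are restacked in order after each pass; the reports cited do not state the order of passes, and the function names are hypothetical.

```python
def sort_pass(cards, digit_index):
    """One pass: distribute cards into ten magazines on one digit, then restack."""
    magazines = [[] for _ in range(10)]
    for card in cards:
        magazines[int(card[digit_index])].append(card)
    # Restack magazine 0 first; order within a magazine is preserved.
    return [card for magazine in magazines for card in magazine]


def sort_cards(cards, digits):
    """Sort equal-length numeric codes by successive single-digit passes."""
    for digit_index in range(digits - 1, -1, -1):   # last digit first
        cards = sort_pass(cards, digit_index)
    return cards
```

A code of n digits needs n passes, which is why a sorter that can be programmed to make successive passes without an operator is of interest.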

In the selection of Minicards, questions are coded 
and Flexowriter tapes are prepared for use on the se- 
lector with limits of 20 words per question. Plug- 
boards are used for logical relations and conditions (141). 
It is also possible to set up boundary specifications for 
recognition of "less than" or "greater than" (134, 142).
The Minicards from the storage magazine pass a read- 
ing head where the data is read from the Minicard 
code field (143). In the selector, the Minicard code field
passes a fixed array of 43 photocell detectors (134). The 
data is examined in the electronic circuit of the select- 
or and, when the data is recognized as satisfying spec- 
ifications of the question asked in the selector, the 
Minicard is directed into a separate receiving maga- 
zine. Minicards not selected pass into a different 
magazine (134). Output of the system is in the form of 
duplicate Minicards or full-size prints of the selected 
master Minicards, the selected cards themselves never 
leaving the system but being returned, after reproduc- 
tion, to the working file (144). 

Thus the Minicards will be used either by means 
of viewers of various types, from which the text on the 
Minicard can be read directly, or through enlargements 
provided by automatic enlarging devices (145). The user 
in either case keeps the copies which have been repro- 
duced for him (144). The duplicator produces Minicards 
at the rate of 120 per minute and the enlarger proces- 
sor makes 300 prints per hour (137). 

Minicard System Claims 

It is claimed that the Minicard system has the 
following properties important for documentation: 

1. The system handles graphic and digital information
in one record medium. The Minicard, a discrete
record unit, has a high information capacity and
a high activity capability.
2. The system has an efficient record duplicating
capability.
3. Document information in the system may be
delivered directly.
4. The system provides the input-output convenience
and file space advantages required for large files.
5. The system has capabilities for organizing files
by machine. This makes possible relatively
short searching time for large files.
6. The system has search capability to satisfy
the requirements of present information
systems (146).

Minicard Development 

On 30 June 1955, a demonstration of "prototype
Minicard equipment" was held for consultants and rep- 
resentatives of government agencies in Rochester, New 
York. At that time it was indicated that the equipment 
was scheduled for delivery for use by the Air Force 
early in 1956. 

The various items of equipment demonstrated 
included: 

1. A step-type camera for recording previously
prepared codes and documents on a continuous film.
2. A processor to develop exposed film.
3. A film cutter, for preparing Minicards from
the continuous film after development.
4. A Minicard sorter for simple sorting of
Minicards. This unit performs operations
similar to those of conventional punched-card
sorting equipment.
5. A Minicard scanner, or a prototype machine
was used to illustrate some of the
operations that this unit will be able to
perform once design and construction work
have been completed (147).

Lewis and Offenhauser in 1956 stated that: 

Many of the details of the Minicard system 
are still under study; this is true especially 
of the specialized sorting and classifying 
equipment associated with its coding aspects. 
The photographic aspects ... appear to be
substantially complete (148).

In 1957 Eastman Kodak representatives stated that 
the Minicard system had not yet been tested in full- 
scale operation, that it is still in a development phase 
and that no decisions have been made as to making
Minicard equipment available commercially (149). 

A report by Hawken states that: 

Minicard has made considerable progress in 
the last year. One complete unit has been in- 
stalled in Washington and is undergoing testing. 
Two other units have been delivered to other 
government agencies. My informant, Dr. Feld- 
man, stated the first unit would be used in in- 
telligence work (probably Pentagon). He was 
not at liberty to disclose the destination of the 
other two units. A fourth unit to be completed 
this summer will be retained by Eastman at 
Rochester and will be available for experiment- 
al work to organizations interested in the po- 
tential use of Minicard for special applications 
of their own. 

The extent to which the physical, chemical, 
optical, sensitometric, electronic, and me- 
chanical problems of Minicard operation have 
been solved is most impressive. More and 
more work is being done to make the system 
more flexible and capable of handling more 
sophisticated problems. But Dr. Feldman ad- 
mitted that the main problems in the effective 
use of the Minicard system are those of clas- 
sification common to all systems, on which 
coding and ultimate retrieval depend (150). 

Cost of the System 

Published accounts by the developers of the sys- 
tem do not indicate costs of equipment or operation. 
Shaw, in 1956, stated that: 

Something on the order of one and one-half 
million dollars has been spent or allocated to 
this development program and it is not yet op- 
erational. 

It has been determined that a second machine
can be built during the process of developing
the first one for $350,000 and that if the
machines were produced in batches of 100 
complete sets, the price of the production run 
would be of the order of $150,000 per instal- 
lation (151). 

Evaluations 

Shaw points out that: 

This system is based on the theory that the 
Minicards selected will be reproduced by con- 
tact printing and then will be restored to their 
old stick, so again the material is not avail- 
able for immediate researching (151). 

Taube, under a contract for the Office of Naval 
Research, has made what he calls "A Case Study in
Document Storage and Retrieval" of the Minicard sys- 
tem. He admits that his analysis could not be checked 
against actual test results since the system had not yet 
reached a full operational stage. His analysis indicates 
that the high reading speed and compact storage of the 
Minicard system are greatly weakened by the systems 
of coding and filing employed. The selector rate of 
1,800 Minicards per minute would require over nine
hours for a sequential search of a file of one million 
items, thus necessitating duplication and prefiling if the 
operation is to be economical. In prefiling, he points 
out that at least one file magazine (a stick with a 2,000
Minicard capacity) must be dedicated to each term or 
class in the system. Otherwise, duplicates in the out- 
put will result from filing Minicards for more than one 
class on one stick. The unequal loading of the Minicard
sticks, with unused space for lightly used terms and
extra magazines required for heavily used terms,
creates problems for interfiling and searching.
The hundred magazine file units require only one cubic 
foot of storage space, but their weight of fifty to one 
hundred pounds which must be transported to and from 
the selector prevents their being stacked, thereby drast- 
ically reducing the claimed space saving. His calcula- 
tions indicate that the Minicard files and machines will 
occupy about as much space as the original file. He 
points out another shortcoming in the placing of both 
text and code on the Minicard, which requires the output
of one search to be refiled before the next search
can begin and which inactivates the selector mechanism 
while the text is passing under the reading heads. He 
believes that the extensive capacities of the selector 
may be more elaborate and thereby more expensive than 
necessary to achieve a satisfactory output (152). Taube 
and Heilprin have also issued a report relating the size 
of questions to the work accomplished by a storage and 
retrieval system. Using the Rapid Selector and Mini- 
cards as examples, a study was made of sequential and 
parallel searching and the effect of multiple simultan- 
eous searching on continuous and discontinuous systems 
(153). 
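
The search-time figure in Taube's analysis can be checked with simple arithmetic. The Python sketch below uses only quantities stated in the text (a file of one million Minicards, selector rates of 1,800 and 1,200 cards per minute, and sticks of 2,000 cards); the function names are hypothetical and the stick count assumes, for simplicity, that every stick is completely full.

```python
def sequential_search_hours(file_size, cards_per_minute):
    """Hours needed to pass every card in the file through the selector once."""
    return file_size / cards_per_minute / 60


def sticks_needed(file_size, stick_capacity=2_000):
    """Sticks required if every stick is completely full (ceiling division)."""
    return -(-file_size // stick_capacity)


FILE_SIZE = 1_000_000

# Rate from the earlier report: a little over nine hours.
hours_earlier_report = sequential_search_hours(FILE_SIZE, 1_800)

# Selection rate from the later report: nearly fourteen hours.
hours_later_report = sequential_search_hours(FILE_SIZE, 1_200)

# Minimum number of sticks for the whole file.
full_sticks = sticks_needed(FILE_SIZE)
```

Since a single full stick passes the selector in roughly a minute at either rate, the economy of the system depends on prefiling so that only a few sticks need be searched for a given question.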

Shaw states that: 

There appears little in this development in 
terms of its potential application to storing 
material in research libraries, locating it and 
reproducing it. There have, however, been 
several developments in the course of this pro- 
ject that are of considerable significance, in- 
cluding the development of 60-diameter reduc- 
tion equipment for filming and reproducing 
from film. The balance of the operation is
not much better than that of punched cards. 
Again the machine and make-ready costs are 
so great that this system will be of value only 
with highly repetitive routine processes, which, 
at least so far, cannot be anticipated in re- 
search libraries. It is a possible substitute 
for Hollerith type cards in installations that 
now use them, since it will operate somewhat 
faster than Hollerith cards and will save a 
great deal of space as compared with punched 
cards. It is, in effect, a punched card sys- 
tem using smaller cards and the Rapid Select- 
or's sorting system (151). 

Bedford has evaluated the Minicard in her review 
of systems of information and storage retrieval (154, 155). 
She states that: 

The Minicard combines the high storage capacity
of microfilm with the desirable characteristics
of the punched card for ease of manipulation,
prearrangement and sorting (156).

She states that in terms of the criteria developed for 
this review of systems: 

The Minicard system can be rated as an ex- 
cellent method for the organization of informa- 
tion. The system is infinitely expansible in 
that it is completely open-ended and it can hold 
information to the depths required for all kinds 
of information. Any changes required to meet 
new and unforeseen developments can be made 
quickly and easily (156). 

In summary, she selects the Eastman Kodak Min- 
icard along with the International Business Machines 
electronic data-processing system as the two systems 
which have emerged from an over-view of machine lit- 
erature searching in the United States as feasible for 
installation in the project with which she was associated 
(157). 

In another project carried on by Kent, Perry and 
Egan, a study has been made of Minicard indexing. 
These reports apparently deal only with indexing and 
not with development of the Minicard system (158). 



Part 5. Miscellaneous Photoelectric Systems

Photographic Glass Disks

The 1954 Patent Office report states that: 

Research and development on photographically 
recorded disks storing digital information in 
binary code form is in progress at Eastman 
Kodak, International Telemeter Corporation, 
Logistics Research, Incorporated, and other 
laboratories (159). 

According to this report: 

The use of photographic glass disks is estimated
to provide storage densities of the order
of 100,000 to 1,000,000 bits per square inch.
A disk that is 6 inches in diameter might store
10,000,000 bits, or the information equivalent
of 150 feet of 35 mm. film. On a band 4 inches
wide at the outer edge of a glass disk 16
inches in diameter 20,000,000 bits might be
stored, all of which would be accessible to a
single reading station. Using a combination of
flying spot scanning techniques and rotation of
the disk itself at about 800 r.p.m., it should
be possible to read this digital information at
data rates of up to 1,000,000 bits per second,
or the information equivalent of about 2,000
conventional punched cards in one second. This
rate would be approximately 3 or 4 times faster
than the rates for magnetic tape so far reported.
However, magnetic tapes can be written on, and
hence revised, at the same data rates as for
reading, whereas the writing process for the
photographic media are much slower and involves
the separate process of exposure, developing
and fixing.

The probable access rates to photographic 
disks are very promising, since the entire 
disk might be read in about 5 seconds as a- 
gainst the 5 to 6 minutes necessary to read 
through the conventional computer tape or the 
Rapid Selector microfilm roll. In the work at 
International Telemeter, the combination of in- 
dex entries with text material has also been 
studied using the same principle of integral in- 
dex that is used for the Rapid Selector (160). 

Photographic techniques for information storage 
have been studied by International Telemeter Corpora- 
tion (161, 162). They reported that storage devices with a 
total capacity in the range of 10^7 to 10^9 bits,
incorporating favorable random access features, were
possible and that the use of color photography could
multiply the data storage capacity by three. The
photographic medium would begin with punched cards. The
fastest system considered would employ a flying-spot
scanner searching a moving storage surface such as a
rotating disk or drum. The flying-spot scanner would
also be used with the photographic storage. It is
believed that the combination of a small erasable
storage with a large photographic store will make available
a device which will make it possible to apply modern 
computing and information processing techniques to 
practical non-numerical problems (161). 

Recall Film Index Systems 

The Patent Office Report also refers to a film 
index system which is closely related to Minicards.
This system, developed by Recall Incorporated, would 
use Kalfax film. This film is sensitive to ultra-violet 
light and has a heat developed emulsion which can be 
processed in normal light. The system provides film 
cards which record reduced images of the document 
and coded index entries which are produced by contact 
printing from a punched card (163). 

Film Library Instantaneous Presentation 

The Benson-Lehner Corporation has announced an
automatic microfilm searching machine whose purpose 
is to search for a particular frame on 16 mm. film at 
speeds of 300 to 600 frames per second and then to 
present this frame to the operator for viewing. A binary
code, in the form of black bars on a clear background, 
is used, the code utilizing 32 bits of information. The 
film reels used are 1,200 feet long, containing 72,000
frames. The code capacity is eight descriptors. Lo- 
cation of the documents is by accession numbers which 
must be in numerical order on the film reel (164). 

Automatic Micro-Film Information System

The Automatic Micro-Film Information System,
known as AMFIS, has been under development by E. A. 
Avakian for some time. A mimeographed report dated 
2 September 1952 has been abstracted as follows: 

AMFIS is the first realistic attempt to materialize
Bush's dream of Memex. AMFIS employs
a mechanism that can store either 1 1/2
million documents or 9 million catalogue cards.
By keying document accession number on a ten
digit key-board operator can project legible
image onto a viewing screen at center of desk,
or at remote points, within two seconds. Documents
are reproduced in less than half a minute
by standard techniques such as photostat or
xerography. Reports may be microfilmed on
8 mm and/or 16 mm film. 20-inch strips
of microfilm are inserted into a holder, which
is scroll-like in arrangement and holds 1,000
such strips, which may be replaced in less
than 30 seconds. The desk viewer is standard
and the remote viewer employs a flying-spot
scanner. The basic mechanical components
were patented under the title of the Stored
Function Calculator, U. S. Patent No. 2,610,791.
The unit described uses 16 scrolls but
more could be added. Browsing is possible
and similar in operation to standard microfilm
readers. AMFIS undoubtedly is a unique approach
to the unidimensional character of film.
Its possibilities for reference and catalog
departments of large libraries should be explored,
as well as in information storage problems
where quick retrieval is essential (165).

The patent referred to describes the equipment
used for storing the microfilm slips and for projecting
particular frames to a viewer upon request (166).

Another report on AMFIS (167) is referred to by
Lewis and Offenhauser (168). 

In a recent report, AMFIS is described as a de- 
vice for combining great reduction of storage space with 
great speed of access. It is indicated that the intent 
is not to have an information retrieval machine but a 
system that will select a given document whose identity 
and call numbers are known. By dialing the call num- 
ber of a document, the required film is moved into po- 
sition so that it can be read on a screen or a full size 
copy made. It is suggested that if a television scanner
were substituted for the light, the document could be 
transmitted over long distances. It is stated that any 
size of film can be used and that the machine can also 
be adapted for work with micro-opaque material. It is 
stated that the capacity of one AMFIS machine is sev- 
eral million microfilm frames which may be stored 
either singly or in strips, allowing for the addition or 
deletion of frames as desired. This report states that 
the system's "final realization merely awaits concrete 
financial assistance" (169). 

Magnavox Film Data Recorder 

Magnavox has developed a film recording system 
known as the Magnavox Film Data Recorder for storage 
of digital and pictorial data together. "It involves a 
reel of microfilm either 16 mm. or 35 mm. with cod- 
ing space in selected frames. The coding space covers 
one-half of a single 16 mm. frame or about one-fourth 
of a 35 mm. frame. A total of 90 bits of information 
are stored in each coding space. ... A machine has 
been developed which can search, select, and display 
specified frames as determined from the coded area 
information. The selected frames can be enlarged and
copies reproduced both of the image data and of the 
coded data" (170). 



E. Punched Card Sorters 

1. IBM Electronic Statistical Machine, 
Type 101 (IBM 101) 

One of the first uses of the IBM 101 multiple col- 
umn sorter was for the Welch Medical Library indexing 
project. In their final report on machine methods for 
information searching (171) the authors indicate that IBM 
suggested the use of the IBM 101 in the fall of 1951. 
The machine was delivered early in 1952 and was used 
for several experiments at that time. Mention was
made in this report of the use of the IBM 101 by the
U. S. Patent Office. Two publications by members
of the U. S. Patent Office staff on machine literature
searching, both presented in 1951 (172, 173), make
references to a code used for a machine-based index but
do not discuss the IBM 101.

In a survey published by Kent in 1958 (174) of non- 
conventional retrieval systems, 9 IBM 101 installations 
are included. These are: 

Battelle Memorial Institute
Ciba
DuPont*
Merck, Sharp and Dohme (Westpoint, Pa.)*
Proctor and Gamble
Schering
Smith, Kline and French*
Socony Mobil*
Union Carbide Chemical*

Installations marked with an asterisk are described in 
separate publications and will be discussed below. 

Characteristics of the Machine 

The IBM 101 can recognize and sort for one or
more punched positions in one or more columns in one 
sort through the machine. The rate of sorting is 450 
cards per minute. The machine is programmed by 
wiring a plug board. This plug board can be supplemented
by a dial board to facilitate programming. Selected
cards can be directed into any one of 12
pockets, the 13th pocket being for rejected cards.

The full machine, i.e. the machine with all in- 
ternal wires connected, can sort up to 60 of the 960 
positions of the card in one sort. Sorting can be pro- 
grammed for a logical product, a logical sum, a log- 
ical difference, or any combination of these logical op- 
erations. 

Information to be sorted on the IBM 101 can be 
entered on the punched cards by direct code, by indi- 
rect code (numeric or random superimposed) and by 
free field code with some modification of the machine. 
The card passes under the reading brushes of the IBM 
101 while moving in a direction parallel to the vertical 
columns. The presence or absence of any hole or com- 
bination of holes in each of these 80 columns can be 
determined only after all 12 horizontal rows of the card 
have passed the reading brushes and effected the trans- 
ferred or non-transferred status of a series of recode 
selectors or relays. After the last row of punches in 
the card is sensed and before the next card is fed the 
relays are electrically tested for conformance with a 
preselected pattern of holes. If agreement is found the 
card is deflected into 1 of 12 pockets rather than into 
the reject pocket. 

The IBM 101 systems described up to the present 
time have merely listed descriptors; relationships 
among descriptors in particular documents are not in- 
dicated. 

It is possible to conduct more than one search at 
one time by directing search results from different 
searches to different pockets. It is also possible to 
conduct one search and to direct cards which only
partially satisfy search criteria into different sorting
pockets. For example, a card which matches one
descriptor is sorted into pocket 1, a card which matches 2
descriptors is sorted into pocket 2, and so on.
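
The pocket assignment by degree of match can be sketched as follows. This is an illustration under stated assumptions: descriptors are modeled as strings, the reject pocket is numbered 13 as in the description of the machine above, and the function name is hypothetical.

```python
REJECT_POCKET = 13  # the text describes 12 sorting pockets plus a 13th for rejects


def pocket_for(card_descriptors, query_descriptors):
    """Return the pocket number for a card: one pocket per count of matched descriptors."""
    matches = len(set(card_descriptors) & set(query_descriptors))
    if matches == 0:
        return REJECT_POCKET
    return min(matches, 12)  # only 12 sorting pockets are available


query = ["oxidation", "catalyst", "polymer"]
print(pocket_for(["oxidation", "solvent"], query))
print(pocket_for(["oxidation", "polymer", "catalyst"], query))
print(pocket_for(["solvent"], query))
```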

The machine has no internal memory. It can
perform calculations on data entered on the punched
cards. A printing unit is incorporated into it.

E. I. DuPont de Nemours Company

Chemical Department 

This machine-based indexing system for internal 
research reports had been in operation for 3 years in 
1957. A pre-installation assessment of the needs of
the laboratory research personnel, the primary users
of the system, indicated that an index had to answer a
myriad of detailed questions in the chemical field and
in bordering sciences such as biology and engineering.
Ability to answer generic questions was also required,
especially as applied to aspects of structure and
elementary composition of chemical compounds (175).
Responsibility for the selection of subject matter to be indexed
is shared by the author of a report and the indexer, but 
final decisions are made by the indexer. 

The index is divided into 4 parts: organic,
polymer, inorganic, and miscellaneous (natural products,
products of unknown composition, mixtures, "trade
name" products, and information which cannot be tied
to a compound or polymer). The following sections of
the list of indexing terms apply to all 4 parts of the 
index: 

Status of material being indexed (product, reactant, etc.)
Types of reaction
Reaction conditions
Properties
End use objectives
Modification or treatment (spinning, coagulation, coloring)
Miscellaneous headings

The index headings for structural features (functional 
groups, ring systems, type of polymer, elements) vary
for each of the 4 parts of the index, although some 
headings will appear in more than one part. 

A list of captions (headings) and the correspond- 
ing codes is kept on punched cards in a "supervisable" 
(visible) file (176). 

Code: Both direct and random superimposed 
codes are used. Direct codes are used in 56 columns 
for the most frequent headings (including document 
number). Random superimposed codes (4 positions 
per unit of information) are used in the remaining 24 
columns of the card. 

In addition to the detailed index to the reports, 
one card covering the major aspects of the subject 
matter is made out for each report. (It is not indica- 
ted whether this is a punched card or an index card). 

Each compound and each polymer component is 
indexed on a separate punched card. This means that 
not all the index entries for a sought report will be on 
one card. A two-step operation is thus necessary, the 
first step being the selection of cards with a particular 
code, the second step being the matching of the cards 
for common serial numbers. If there are too many 
answer cards for visual comparison of report numbers, 
they may be put in numeric sequence by machine so 
that cards belonging to the same report may be grouped 
together (177). 
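
The two-step operation can be sketched as below. The sketch assumes each punched card is modeled as a report number plus a set of codes; the field names, sample codes, and report numbers are hypothetical.

```python
from itertools import groupby


def reports_in_common(cards, codes):
    """Step 1: select cards carrying any sought code.
    Step 2: put them in numeric sequence and keep reports whose cards
    together cover every sought code."""
    wanted = set(codes)
    selected = [c for c in cards if wanted & c["codes"]]
    selected.sort(key=lambda c: c["report"])  # numeric sequence by report number
    matches = []
    for report, group in groupby(selected, key=lambda c: c["report"]):
        found = set().union(*(c["codes"] for c in group))
        if wanted <= found:
            matches.append(report)
    return matches


# One card per compound or polymer component, so a report may span several cards.
cards = [
    {"report": 101, "codes": {"ester", "product"}},
    {"report": 101, "codes": {"nylon", "reactant"}},
    {"report": 102, "codes": {"ester"}},
    {"report": 103, "codes": {"nylon"}},
]
print(reports_in_common(cards, ["ester", "nylon"]))
```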

Use: Frequency of use was not indicated. 

Claims or comments: It is believed that the 
framework is adaptable to most types of information 
occurring in the reports being indexed. Any part of 
the index can be expanded to accommodate the addition- 
al needs. This index does not provide a unique iden- 
tification of every compound since such completeness 
would demand a much more extensive system than is 
required for the size of installations contemplated. Us- 
age thus far has given gratifying results since very few 
unwanted cards have been selected (178). 

Merck, Sharp and Dohme

A machine-based index to published articles has 
been employed in the library here since 1950 (179). A
Remington Rand sorter with a multiple (12 column)
sorting attachment was first used. However, an IBM
sorter is now being used. Approximately 30,000
references are included in the file at present. The library
service is for a group whose interests touch on nearly 
every phase of the biological, chemical, and medical 
sciences (180). A staff of 2 professionals and 3 typists 
handles the input and output of the system. 

The dictionary of indexing terms for the system 
consists of 3 parts: 

General subject (an alphabetical list of about 1,000 terms)
Chemicals (slightly less than 1,000 compounds)
Diseases (terms from the American Medical Association's
    Standard Nomenclature of Diseases)

The dictionary of general subject terms contains
cross references and instructions to the indexer for
indexing under more than one level of heading, e.g.

Children    Also coded: humans
Humans      Also coded: children, when pertinent (181)

A 4-position random superimposed code is used for the 
more uncommon terms; a direct code is used for the 
common terms. 

A dial board is attached to the sorter to facilitate 
programming the machine. For making special types 
of searches, a wiring system has been developed which 
will separate the 15 possible combinations which can be 
obtained from 4 descriptors into the 12 sorting pockets 
of the machine. Thus if the 4 descriptors are iden- 
tified by A, B, C, and D, ABCD would be sorted into 
pocket 12, BCD into pocket 11, ACD into pocket 10, 
ABC into pocket 1, ABD into pocket 2 . . . A or B or C
or D into pocket 9. Cards with a single position
punched will have to be re-sorted (182).
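
The arithmetic behind fitting 15 combinations into 12 pockets can be checked with a short sketch. It assumes, as the example above suggests, that all four single-descriptor cases share one pocket; the pocket numbers for the two-descriptor combinations are not given in the source and are not assigned here.

```python
from collections import Counter
from itertools import combinations

descriptors = "ABCD"

# Every non-empty combination of the 4 descriptors.
subsets = ["".join(c) for r in range(1, 5) for c in combinations(descriptors, r)]
by_size = Counter(len(s) for s in subsets)

# One pocket per combination of 2 or more descriptors, plus one shared
# pocket for cards matching only a single descriptor.
pockets_needed = sum(n for size, n in by_size.items() if size > 1) + 1

print(len(subsets))
print(dict(by_size))
print(pockets_needed)
```

Under this assumption the 1 four-way, 4 three-way, and 6 two-way combinations plus the shared single-descriptor pocket account for exactly the 12 pockets.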

Mark sensing cards are used for the preparation 
of sets of index cards (183). False drops resulting 
from an interaction of the random superimposed code 
are not considered a problem. It is the goal to keep 
machine time for an "ordinary" search to 15 minutes. 
To do this, cards are divided by year and subdivided 
by types of papers, namely human clinical papers, vet- 
erinary clinical papers, experimental papers in the 
field of biology, and all other (184). 

In an earlier paper (185) Mrs. Schultz points out 
some of the advantages and disadvantages of a punched 
card system. The advantages cited are flexibility in 
various respects; cards do not have to be filed in any 
order, answers can be obtained to specific questions 
without manual selections, and questions can be asked 
by any number of combinations of terms. The cited 
disadvantages are operational: 

The ability to search the card file depends on 
the proper operation of the machine. When 
the machine breaks down the system is temp- 
orarily inoperable. . . . Some searches can be 
made more rapidly with standard indexing. To 
find all the papers of a particular author would 
mean searching the entire punched card file 
(186). 

Use: No data is given on the use of the file.

Smith, Kline and French

The system includes over 10,000 published and
unpublished documents on a single drug, chlorproma- 
zine, at Smith, Kline and French. Documents contain 
clinical and pharmacological information about the drug. 
The users of the system are scientists at the company 
laboratories (187). Highly trained scientists are not 
needed to operate the system, although the indexers
must have scientific training to read the reports
intelligently and to translate the author's words into code
words (188). In 1957, two years' experience with the
system had been completed (189).

A separate code is used for clinical and pharma- 
cological information but all data on a document is en- 
tered on one card (190). The pharmacological code con- 
sists of less than 260 index headings which are entered 
on 12 columns of the IBM card. The headings are cat- 
egorized; direct (5 columns) and free field (7 columns) 
codes are used. A portion of the pharmacology head- 
ings with an indication of the type of code used may be 
given here: 

Subject (direct code)
    not specified, several
    rodents
    birds, amphibia
    dog and cat

Site of action (free field code)
    Central nervous system
    cerebrum
    hypothalamus
    brain stem
    spinal cord

Special feature (direct code)
    unusual dose
    unusual route
    absorption

Type of study (direct code)
    chemical
    bioassay
    pharmacy

Body system (direct code)
    metabolism of reference drug
    blood, hemopoietic
    cardiovascular

Type of action (direct code)
    reference drug on function
    reference drug on metabolism
    reference drug on histology

The free field code makes use of 6 punches per column.
This allows a total of 924 combinations per column. Any
one of the combinations can be entered in any one of the
7 columns reserved for this code, since the code is
sorted by pattern of holes on the card rather than by
the location of the hole.
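
The figure of 924 is the number of ways of choosing 6 punch positions out of the 12 in a column. The sketch below checks this and illustrates matching by pattern rather than by column; the function name and the representation of a column as a set of row indices are assumptions made for illustration.

```python
from math import comb

ROWS_PER_COLUMN = 12
PUNCHES_PER_CODE = 6

# Number of distinct 6-punch patterns in a 12-position column.
print(comb(ROWS_PER_COLUMN, PUNCHES_PER_CODE))


def find_code(free_field_columns, pattern):
    """Return the indices of the free field columns whose punches equal the pattern.
    The column in which the pattern sits does not matter for a match."""
    wanted = frozenset(pattern)
    return [i for i, col in enumerate(free_field_columns) if frozenset(col) == wanted]


# Hypothetical free field: three of the 7 columns are shown, one left blank.
columns = [{0, 1, 2, 3, 4, 5}, set(), {1, 3, 5, 7, 9, 11}]
print(find_code(columns, {1, 3, 5, 7, 9, 11}))
```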

No detail is given about the clinical code except 
that the code is set up with a high proportion of direct 
punches or assigned positions so that data can be cor- 
related and tabulated (191). The 6 position free code 
field can be sorted by wiring only one of the machine's 
selectors (192). 

The results of the search are on cards on which 
the corresponding document number is punched and in- 
terpreted. The cards are either examined visually or 
used to print a list of IBM 101-prepared document num- 
bers. References in the chlorpromazine file are also 
in Flexowriter tapes. Bibliographies can be prepared 
from these tapes (193). 

Use: No data is given on the use of the file. 

Claims: It is claimed that "There are no false
drops, though in our experience the false drop is more
a theoretical menace than an actual one" (194). The
authors point out that this type of index is not limited 
to an IBM 101. It could be adapted to a card index, 
they state (195). Some of the other advantages claimed 
for the system over other systems are: 

It requires no decision by the indexer about 
the relative importance of the various factors. 

The subject heading list can be enlarged or 
made more detailed without need to disturb in- 
dexing already done or to upset the entire 
classification system. 

The code was planned to be applicable to any 
area of pharmacology or physiology (196). 

Socony Mobil 

The system at Socony Mobil includes about 95,000
articles and patents on petroleum technology indexed
since January, 1952 (197). A staff of technical
personnel selects, codes and classifies 12,000 articles
and 4,000 patents per year. The system is primarily 
for the 330 technical employees of the research and 
development laboratory, although occasional calls arise 
from other divisions (198). 

The machine installation actually serves three
purposes:

1. Bibliographic cards (author, title, bibliographic
citation and classification), which are prepared for
each new reference on IBM cards. Weekly classified
title lists are prepared directly from these IBM
cards.

2. Continuous searches, which are prepared on
requested subjects; bibliography cards are prepared
weekly and forwarded to the 23 men who receive this
service (199).

3. Retrospective searches.

Subject matter is encoded directly onto 41 columns 
of the card. The code is reproduced in a machine lit- 
erature processing demonstration manual (200). Individ- 
ual indexing units, the descriptors, are classified by 
broad categories such as unit operations, equipment, 
physical properties, and hydrocarbon chemicals. Since 
2 or more descriptors are combined to form a search 
entry the descriptors are relatively generic. An ex- 
ample of descriptors used for a search is given by 
Crandall and Stumpf (201). The descriptors are: 

Performance testing 
Improvement by additives 
Oxidation 

Petroleum products 
Lubricating oils 

Since descriptors are only listed and their rela- 
tionships are not indicated, some commonly used des- 
criptors are repeated in two categories to reduce the 
incidence of false combinations. For example,
chemical elements are repeated under inorganic chemicals
and under elements analyzed for. 

The average machine running time for a complete 
literature search ranges from 5 to 6 hours according 
to the complexity of the request (202). 

No overall statistics are given on false drops. 
The illustrated search (203) contained 24% pertinent ref- 
erences. What percentage of the non-pertinent refer- 
ences were noise and what percentage were false drops 
was not indicated. 

In addition to the detailed index which is machine 
searched, the system is backed up with several
auxiliary files, called satellite files, which are hand
searched. These include a trade name file and a principal
code (subject) file. The latter file is used for spot 
searching where only a few articles are necessary. 

Use: The system was used to conduct 33 machine 
searches in 1954, 41 machine searches in 1955, and 62 
machine searches in 1956. An average of 45 manual 
searches per year was also conducted (204). 

Evaluation: This is one of the earliest applica- 
tions of machine literature searching in an industrial 
organization. The system has now been in operation 
for 7 years. Early experiments on a machine sorted 
punched card system were started as early as 1948 
(205). It is a dual purpose system in that the cards 
are used for the weekly accession lists as well as for 
retrospective searching. The cost of the installation is 
consequently charged to both current dissemination of 
information and retrospective searching. Nevertheless 
the total cost seems to be high in terms of profession- 
al manpower, and the use of the information retrieval 
part of the system seems on the low side. 

The use of one card per reference gives greater 
flexibility in sorting. That is, more descriptors can 
be coordinated in one sort through the machine and no 
subsequent sorting is necessary to identify matching 
document numbers. But the use of one card per
reference does result in a larger percentage of false drops
because of the interaction of non-related descriptors. 
In view of the relatively low use of the system, it is 
probably more efficient to separate these false drops 
manually as is done in this installation rather than to 
try a more complex code in order to avoid them. 

Union Carbide Chemical Company 

Internal reports and patents are included in the 
machine literature searching installation. The number 
of documents in Union Carbide's system is not indicated
in the 1957 reference but the file is said to consist of
over 20,000 cards and to contain some half million
entries from patents and internal reports (206).

Both direct and random superimposed coding are 
used to encode subject matter. Specific subjects, 
namely individual chemical compounds, are identified 
by a 5-digit serial number followed by a single digit to
indicate the role of that particular chemical in the doc- 
ument. These roles are: 

Reactant 

Product 

Catalyst 

Chemical agent 

Materials of construction 

Physical agent 

Negation (removal or absence of chemical) 

The presence in the document of analytical methods or 
physical properties is also indicated by means of such 
role indicators. 

The 6 digits for the specific subject (5 digits) and 
its role (1 digit) are entered directly into a 6 column 
field. Index entries for other than specific chemicals
are entered as 4 positions in a 10 column random
superimposed code field. These entries are called
concepts and are exemplified by terms such as aldehydes,
olefins, and oxidation (207). About 1,500 of these
concepts are used.

The punched card is divided into 16 code fields,
7 of which are used to encode subject matter. Six 
6-column fields are used for encoding up to 6 chem- 
icals and their roles per card. The 10 column field 
is used for the random superimposed coding of con- 
cepts. Since a specific subject can be entered in any 
one of the six code fields without any prior order, the 
matching appears to be by code pattern rather than by 
position. 
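
Matching by code pattern rather than by position can be sketched as follows. The sketch assumes each of the six chemical fields is read as a 6-character string (5-digit serial number followed by a 1-digit role); the serial numbers and role digits used are hypothetical, since the source lists the roles but not their digit assignments.

```python
def card_matches(chemical_fields, serial, role=None):
    """Return True if any of the (up to six) 6-digit fields carries the sought
    5-digit serial number and, when a role is given, the sought role digit."""
    for field in chemical_fields:
        if field[:5] == serial and (role is None or field[5] == role):
            return True
    return False


# Six 6-column fields (36 columns) plus one 10-column concept field
# account for 46 of the card's 80 columns.
card = ["004172", "131053"]  # hypothetical entries; unused fields omitted

print(card_matches(card, "13105", "3"))
print(card_matches(card, "13105", "1"))
print(card_matches(card, "00417"))
```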

Use: No data is given on the use of the
installation.

Claims: 

Every single piece of information in this file
can be literally read in less than an hour as
often as requested. . . . The cost of a single
retrieval in technical personnel time is less
than one tenth the cost of a conventional
search regardless of the number of subjects
covered (208).



2. IBM Electronic Statistical Machine
Type 101 (IBM 101)
With Row by Row Scanning Attachment

An IBM 101 with a row by row scanning attach- 
ment has been described by Luhn (209). This was 
demonstrated at the International Conference on
Scientific Information at Washington, D. C., in November,
1958. The basic coding unit sorted with this machine
is the 80 column row; there are 12 such rows per IBM 
card. Each row can accommodate 12 six-letter words 
in Hollerith code or even more in a more condensed 
code. Indications of relationships among descriptors 
can also be brought out in the code. 

The machine is programmed by wiring a plug 
board. Searches are made for patterns within a single 
row of a card. One row is searched after another. 
The incidence of a match is stored until all 12 rows on 
a card have been scanned. Searches can be made for
the presence of any one or any combination of several
of the desired patterns on the same card (210).

Results of the search (document serial numbers) 
are printed out as the search progresses so that the 
order of the cards in a file need not be disturbed (211). 
This is a requirement if 2 or more cards are made out 
for any one document, since relationships among cards 
would otherwise be destroyed. 

An experimental system making use of this ma- 
chine at the IBM Research Center is briefly mentioned 
by Luhn. The index was prepared entirely by an IBM 
704, which was fed a machine-readable transcript of 
the documents. (The computer was programmed to se- 
lect indexable information and to translate it into index 
language.) A "nodal index" was used in the experiment. 
This is an index which includes for each key word
(descriptor) all of the other key words which are found to
have been paired with it in the original texts (212). 
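
A nodal index of this kind can be sketched as a mapping from each key word to the set of key words found with it. The sketch assumes that "paired" means occurring together in the same unit of text; the source does not define the pairing rule, and the function name and sample key words are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def nodal_index(keyword_sets):
    """Map each key word to all other key words that occur with it
    in at least one unit of text."""
    index = defaultdict(set)
    for keywords in keyword_sets:
        for a, b in combinations(sorted(set(keywords)), 2):
            index[a].add(b)
            index[b].add(a)
    return dict(index)


texts = [
    ["indexing", "machine", "retrieval"],
    ["machine", "translation"],
]
index = nodal_index(texts)
print(sorted(index["machine"]))
```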

Use: No information is given on the use of the 
system. 

Evaluation: The IBM 101 with row by row scan- 
ning attachment has two advantages over the convention- 
al IBM 101: 

1. Relationships among descriptors can be in- 
corporated into the code. 

2. The code is no longer restricted to the 
fixed capacity of a conventional column by column 
sorted punched card. 

Row by row sorting allows punched cards to be handled 
as a continuous unit, as does magnetic tape. Against 
these advantages must be set the following disadvantag- 
es stated in the report: 

Each set of cards, comprising a single record, 
must include two extra cards, one containing 
control instructions for the machine, and an- 
other to act as a spacer card (213). 

Thus if there is an average of 2 cards per document in 
the conventional IBM 101 system and 4 cards in the 
IBM row by row scanning attachment system, then the 
2 extra cards per document would require in effect 
double the sorting time. 



3. The Universal Card Scanner 

The Universal Card Scanner is another machine 
which was demonstrated at the International Conference 
on Scientific Information at Washington, D. C., in
November 1958. This machine and 2 literature search- 
ing applications in IBM libraries are described by Luhn 
(214). These appear to be the only literature search- 
ing applications, since the machine is not commercially 
available at present. 

The operation of the machine is described by
Luhn (215):

The Universal Card Scanner (UCS) scans cards 
fed through it in a manner similar to that em- 
ployed on conventional punched card sorters. 
It is capable of discovering whether any or sev- 
eral of a given set of patterns are wholly or 
partly contained in any of the record cards 
scanned. This function is performed by a no- 
pulse matching process under the control of a 
question card which contains prototypes of the 
patterns sought, likewise represented by 
punched holes. This is the adaptation of an 
electronic method to the optical principle of 
'matching by blackout,' employed in an earlier
IBM card scanning machine, frequently
referred to as the 'Luhn Scanner.' As was the
case in the earlier mode, the present machine 
features the use of a punched IBM card (Ques- 
tion Card) for furnishing the patterns to be 
searched for in a record file. 

The particular matching process employed 
in the UCS requires that the pattern on the 
record cards be given in complementary form, 
i.e., the various marks or elements of the
pattern need to be represented by the absence
of holes and all else by the presence of holes.

The entire card is scanned in one pass through the ma- 
chine. 
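
One reading of the quoted description is that a match occurs when no position has a hole in both the question card and the complementary record card, so that no pulse gets through. The sketch below illustrates that reading only; the positions are modeled as integers, and the function names and field width are hypothetical.

```python
def complement(marks, width):
    """Record-card form: holes everywhere in the field except at the marked positions."""
    return set(range(width)) - set(marks)


def no_pulse_match(question_holes, record_holes):
    """Match when no position is punched in both cards, i.e. every mark sought
    by the question card falls on an unpunched position of the record card."""
    return not (set(question_holes) & set(record_holes))


WIDTH = 12  # hypothetical field width
record = complement({2, 5, 9}, WIDTH)  # record pattern has marks at 2, 5, 9

print(no_pulse_match({2, 9}, record))  # sought marks are all present
print(no_pulse_match({2, 3}, record))  # mark 3 is absent from the record
```

Under this reading a question pattern that is only part of the record pattern still matches, which is consistent with patterns being "wholly or partly contained" in a record card.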

The machine scans a card as a unit, i.e.,
whatever is contained within the twelve posi- 
tions of the card columns is treated as one 
continuous pattern and a match or lack of 
match is determined once per card on the 
basis of 12 such position patterns. Patterns 
may be of any width desired and a plurality of 
them may be recorded across the card (216) 
at predetermined locations, either adjoining or
overlapping each other. 

Several types of codes are mentioned for possible 
use with the machine: the standard alpha-numeric 
Hollerith code (217), random superimposed code (218), 
and word coding (words used in their original form), 
made more efficient by spreading the words over a larg- 
er portion of the card using the first letter of the word 
to indicate the starting column (219). Randomized 
square coding is also mentioned. For this code suc- 
cessive letter pairs are used. These letter pairs are 
marked as the intersections of rows and columns, 
where a particular row stands for the first letter of a 
pair and a particular column for the second letter. A 
method called chain spelling is used here; it consists of 
linking the pairs by repeating the second letter of a 
pair as the first letter of a succeeding pair. This 
chain is closed on itself by an additional pair by end- 
around spelling of the last letter and first letter of the 
code word. The code word TUG, for example, is 
spelled TU, UG, and GT. Relationships among words, 
i.e., the coding of 2 words as a phrase, can be
indicated by a modification of this code. If the code words
TUG and DEV are to be connected they would be
encoded as TU, UG, GD, DE, EV, VT. Individual words
can still be identified by disregarding end-around letter 
pairs (220). 
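
Chain spelling as described above can be sketched in a few lines. The function name is hypothetical, and the mapping of letter pairs onto rows and columns of the card is left out because the source does not specify it.

```python
def chain_spell(*words):
    """Return the letter pairs for one code word, or for several code words
    linked as a phrase, closing the chain with an end-around pair."""
    letters = "".join(word.upper() for word in words)
    # Each pair repeats the second letter of the previous pair as its first letter.
    pairs = [letters[i] + letters[i + 1] for i in range(len(letters) - 1)]
    # Close the chain: last letter followed by first letter.
    pairs.append(letters[-1] + letters[0])
    return pairs


print(chain_spell("TUG"))
print(chain_spell("TUG", "DEV"))
```

The first call gives TU, UG, GT and the second gives TU, UG, GD, DE, EV, VT, in agreement with the examples in the text.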

Searches can be made for logical sums, products, 
and differences (221). A search can be programmed to 
sort cards with different degrees of code matching into
different pockets. For example, cards which scored 
no code match are sorted into the reject pocket, cards 
which scored one code match are sorted into pocket 1, 
cards which scored 2 code matches are sorted into 
pocket 2, and so on (222). 

Two applications of the UCS are described. One 
experimental system is used for technical reports in 
the technical library of the IBM Military Products Di- 
vision at Owego, N. Y. The system supplements a 
title and author card catalog of technical reports. De- 
scriptors are entered on IBM cards as 3 letter random 
squared codes, as discussed above, in any one of three 
12 column fields. (Only one of the 3 fields has been 
used so far; the other 2 fields are devoted to biblio- 
graphic information) (223). 

The code is entered on the record card (document 
card) by machine duplication of a dictionary card. The 
dictionary card contains the dictionary term in word 
form. Housekeeping information and the particular pat- 
tern of punches are repeated in three 12 column fields. 
Machine duplication can be made selectively in any one 
of the 3 locations on the record card. 

The question card is divided into six 12 column 
fields. Code patterns are entered into the first 3 fields 
by machine duplicating corresponding dictionary cards. 
The code patterns in the 3 additional fields are also 
duplicated by machine but the dictionary card is turned 
around (the former right edge pointing to the left) and 
thus a mirror image is produced in the field. This 
mirror image is compensated for by appropriate wiring 
of the control panel (224). 

The second application at the Information Retriev- 
al Research Department of the IBM Research Center 
deals with literature on information retrieval and ma- 
chine translation. The installation is similar to the 
first example with one major exception. The encoding 
operations were carried out entirely by an IBM 704. 
A card or cards are made for each author, title, and 
source of document. Descriptors are selected by the 
machine from titles of documents. By means of a
table look-up, a pre-determined set of insignificant or
"common" words is excluded from the titles. The
remaining words are considered to be significant and
useful as descriptors. These words are listed for each
document and are stored on magnetic tape. Record 
cards are prepared by the IBM 704 for each document 
(225). 
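
The selection of descriptors from titles by exclusion of common words can be sketched as follows. This is a present-day illustration, not the IBM 704 program; the list of common words, the function name, and the sample title are hypothetical.

```python
import re

# Hypothetical table of insignificant or "common" words.
COMMON_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def title_descriptors(title, common_words=COMMON_WORDS):
    """Return the words of a title that are not in the common-word table,
    in order of first appearance and without repeats."""
    descriptors = []
    for word in re.findall(r"[a-z]+", title.lower()):
        if word not in common_words and word not in descriptors:
            descriptors.append(word)
    return descriptors


print(title_descriptors("The Automatic Creation of Literature Abstracts"))
```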

Use: No information is given on the use of the
systems. 

Evaluation: The UCS may be compared with the 
IBM 101 since the machines are similar in size, cost, 
and capabilities. The UCS has several advantages 
over the IBM 101. Its sorting speed is said to be 
1,000 cards per minute (226), which is over 2 times 
the speed of the IBM 101. Randomized square coding, 
a pattern rather than a position code which is not
readily applicable to the IBM 101, makes very efficient use
of card space. The space thus gained on the card can
be used for entering bibliographic information.
Randomized square coding also permits the indication of
relationships among code words. Programming of the
machine by inserting a punched card is faster than the
control panel wiring required of IBM 101 systems. But
additional control panel wiring is also mentioned (227)
and thus the actual saving in time for programming
might be small. Both the IBM 101 and the UCS are
capable of handling logical products, sums, and
different types of searches. The ability of separating cards
by the number of matched codes can reduce the number
of times a deck has to be sorted. This can be done
with both machines, but in the case of the IBM 101 this
requires rather intricate wiring.

A drawback of the UCS is that it is a single-pur- 
pose machine. It does not have the flexibility of the 
IBM 101 which can also be used in statistical work and 
can do a certain amount of printing. 



4. Interrelated Logic Accumulating Scanner 
(ILAS) 

In 1951 an experiment with one of the first row 
by row scanned punched card searching systems was 
presented at an American Chemical Society meeting by 
members of the U. S. Patent Office staff. (Samain 
had suggested the use of row by row scanning instead 
of conventional column by column scanning to obtain 
greater flexibility as early as 1945 (228).) The paper
given by the Patent Office staff members, which was later
published (229), is interesting for several reasons: a
mechanism is provided for linking and sorting for re- 
lated descriptors in the code; use is made of specific 
descriptors with more generic descriptors in the form 
of a classification system built in. This last technique 
was subsequently suggested by Perry (230) and called 
preparation of abstraction ladders. It forms the basis 
for the ILAS. 

The operation of the original row by row scanner
is described by Andrews in the Patent Office Research
and Development Report #6:

The machine tested each row of holes for a 
desired pattern before the next succeeding row 
of holes reached the reading brushes. If such 
tests were affirmative the transfer of a relay 
took place and its transferred state continued 
for all succeeding holes of the card as well as 
for an indefinite number of succeeding cards 
until such time as the relay ultimately was re- 
leased by a specially located control punch in 
one of the cards which would then cause the 
sorting of the last card (into a particular
pocket other than the reject pocket). By this
means, twelve rather than one logical decisions
were made for each card but what is more
important is that coded data of any length could
be processed as a single record without being 
confined to a single card (231). 

In June 1956 a Census Bureau 488 Multi-Column 
Sorter, a machine which has many features in common 
with the commercial IBM 101, was delivered to the 
Patent Office. This is the machine which was complete- 
ly rebuilt into what is now known as the ILAS. The 
machine consists of the card sorting unit and a connect- 
ing console in which the logical circuitry is housed. 

The ILAS contains 80 column relays which detect 
the presence or absence of a hole under each of the 
card reading brushes 12 times during each card cycle. 
Each column relay has 12 independent sets of contacts 
having normal (no hole punched) or transferred (hole 
punched) positions so that 12 different combinations of 
80 punches may be detected by proper interconnection 
of all corresponding sets of contacts on each of the 80 
relays. Making these connections by plugboard would 
be a very difficult job, and one almost impossible to 
check. Instead, rotary switches are used. These ro- 
tary switches are mounted on the front panel of the 
console. A hexadecimal code is used for descriptor 
codes programmed by the rotary switches. The hexa- 
decimal code allows 16 combinations (descriptors) from 
4 positions, which in this case are 4 positions in a 4 
column field. Each rotary switch has 20 positions on 
each of the 4 levels and makes the interconnection be- 
tween common, normal and transfer contacts of corres- 
ponding contact sets of 4 successive column relays. 
This means that each of the 16 positions of the 16 des- 
criptors in the 4 column hexadecimal code field can be 
set with the rotary switch. In addition, one position 
serves as a shunt for all 4 column relays with which it 
is associated and another position serves as a shunt
for the one 4 column field which it controls as well as 
all the remaining column relays of the row which is un- 
der the control of the switches (232). These last two 
positions facilitate programming. Only rotary switches 
which are to transfer will have to be changed (provided 
that the others are in the shunting position); and one 
rotary switch can be used to control all succeeding ro- 
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tary switches in a row if a change in the program re- 
quires the deletion of all codes in a particular row. 
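
The effect of the switch settings can be pictured with a short routine. What follows is only an illustrative sketch in Python, not a description of the ILAS circuitry: it assumes that each 4 column field of a row is read as a 4 bit binary number, and the names SHUNT, SHUNT_REST, and row_matches are hypothetical.

```python
# Hypothetical model of one row of rotary switches; not the ILAS wiring itself.
SHUNT = "shunt"            # ignore this switch's own 4 column field
SHUNT_REST = "shunt rest"  # ignore this field and every later field in the row


def row_matches(switches, row):
    """Return True if a punched row satisfies the switch settings.

    switches: one setting per 4 column field, either an int 0-15
              (one of the 16 descriptor positions) or a shunt marker.
    row:      string of '1' (hole) and '0' (no hole), 4 characters per field.
    """
    for index, setting in enumerate(switches):
        if setting == SHUNT_REST:
            return True          # nothing further in the row is tested
        if setting == SHUNT:
            continue             # this field is not tested
        field = row[4 * index: 4 * index + 4]
        if int(field, 2) != setting:
            return False
    return True


row = "1010" + "0011" + "1111"
print(row_matches([10, SHUNT, 15], row))      # second field is shunted
print(row_matches([10, 4, 15], row))          # second field must equal 4
print(row_matches([10, SHUNT_REST, 0], row))  # only the first field is tested
```

The two shunt markers correspond to the two extra switch positions described above: only the switches that carry a descriptor need to be set, and one switch can cancel the rest of its row.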

The punched card used for this experiment is di- 
vided into several parts: 

1. Signal: 

The first four columns of each horizontal
row provide a 'Signal' field in which is punched
any one of 12 different combinations of holes
to be used as markers. For example if each
row of holes in the card is considered to be a
'word' consisting of a number of different
hexadecimal characters, then a group of such words
separated by a distinctive signal could be a
'phrase,' a number of phrases could be grouped
to make a 'sentence,' and several sentences
grouped to form a 'paragraph.'

This is quite similar to what Perry calls barriers (233). 

A distinctive signal would also be used to mark 
the end of all the codes representing each pat- 
ent or document. These signals serve very 
much the same logical function that parenthes- 
es and brackets serve in mathematic expres- 
sion (234). 

These signals are programmed with a plugboard. 

2. Modulant: a single hexadecimal character
from the 4 position field in columns 5-8 is used to
modify the remainder of the code word in that
particular row. An example of modulant use is given in
another Research and Development Report of the Patent
Office (235). The code word "diazotization" is qualified
by means of modulants to diazotizing agent, diazotizes, 
diazotizable, is diazotized, and diazotized compound re- 
spectively. A modulant field is provided for each of 
the 12 code words entered on the 12 rows of the card. 
A hexadecimal code controlled by a rotary switch is 
used to program for the modulant. 

3. Subject matter: the code words (unmodulated
and unlinked descriptors) are entered in columns
9-68. This area is divided into fifteen 4 column fields,
each field being used for a 4 bit hexadecimal code. (A
bit is a given position in the row.) Consequently, a
code word of up to 15 hexadecimal characters can be 
entered on a single row. This area of the card is also 
programmed by means of rotary switches described a- 
bove. 

4. Interfix: the signal code groups code words 
into phrases, sentences, and paragraphs. This is use- 
ful, but still another grouping device is necessary. 
Andrews explains it: 

Many components must be represented logically 
as being in two or more different groups. A 
code representing such a component could not 
be physically located in more than one group 
(provided by the signal code) and repeating the 
code for each (signal) group would lose the re- 
lationship of the several codes as pertaining to 
but a single component (236). 

To provide for a second order of relationships another 
coding device is included. This is the interfix. An 
example of its use is given by Andrews: 

A gear identified as 'A' drives shaft 'B' in
turn drives pulley 'C.' If we code for A any
one of a set of marks, such as '3' and add
the same number to the code for B meaning
that the driver-driven relationship is present
if the two codes contain the same, although
unspecified, interfix number. Similarly, B would
be interfixed to C but a different arbitrary
interfix number would be selected, such as '5.'
The interfixed codes would then look like this: 

A 3 

B 3, 5 

C 5 (237) 

In addition to the interfix code, A, B, and C would have
a common signal code to group the code words into a
phrase. Interfixes are entered into columns 69 to 80
of the card. This area may be handled as a single
field or as multiple fields of 2, 4, or 6 columns each.

The interfix is programmed with a plugboard. 
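
Andrews's gear, shaft, and pulley example can be restated as a small lookup. This is a minimal sketch in Python that assumes each code word simply carries a set of interfix numbers; the function name directly_related is hypothetical and nothing here describes the plugboard wiring.

```python
# Interfix numbers taken from Andrews's example: A 3; B 3, 5; C 5.
interfixes = {"A": {3}, "B": {3, 5}, "C": {5}}


def directly_related(first, second, table):
    """True if the two code words carry at least one interfix number in common."""
    return bool(table[first] & table[second])


print(directly_related("A", "B", interfixes))  # gear drives shaft
print(directly_related("B", "C", interfixes))  # shaft drives pulley
print(directly_related("A", "C", interfixes))  # no direct relationship coded
```

The point of the device shows in the last line: A and C fall in the same phrase by their signal code, yet the interfix records that the gear does not drive the pulley directly.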

5. Hit relays: since coded data for any document
does not have to be confined to a single card,
some device has to be used to "remember" that part
of the search specifications which has been satisfied.
It must also instruct the sorter to alter the path of a
card when all the search specifications have been met.
In the ILAS, 24 hit relays are used for this purpose.
Each hit relay has separate pickup and dropout
connections and common normal and transfer contacts.
There are also 48 interfix hit relays which are
interconnected to "remember" each punch in the 12
interfix columns for up to 4 different code words
interfixed in the same column (238).

Searches for logical products, sums, differences, 
and combinations of these logical operations can be 
performed on the ILAS. The speed of searching is 
450 cards per minute, the same speed as the IBM
101. On one occasion this speed was increased to 500 
cards per minute before timing difficulties resulted in 
erroneous sorting (239). 
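
The three logical operations can be illustrated with ordinary set operations. The following Python sketch is an assumption-laden illustration only: the documents, descriptors, and the function search are invented for the example, and the ILAS performs the equivalent tests with relays rather than with stored sets.

```python
def search(documents, required=(), any_of=(), excluded=()):
    """Return ids of documents meeting a combined logical specification.

    required: every descriptor must be present (logical product)
    any_of:   at least one descriptor must be present (logical sum)
    excluded: no descriptor may be present (logical difference)
    """
    hits = []
    for doc_id, descriptors in documents.items():
        if not set(required) <= descriptors:
            continue
        if any_of and not (set(any_of) & descriptors):
            continue
        if set(excluded) & descriptors:
            continue
        hits.append(doc_id)
    return hits


# Hypothetical file of three documents and their descriptors.
documents = {
    1: {"ethylene", "catalyst"},
    2: {"ethylene", "solvent"},
    3: {"propylene", "catalyst", "solvent"},
}

print(search(documents, required=["ethylene"], excluded=["solvent"]))
print(search(documents, any_of=["ethylene", "propylene"], required=["catalyst"]))
```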

The card deck has to be kept in exact order;
otherwise relationships established among cards by a
particular sequence are destroyed. Since all the positions
on the card are used for the code and no space is
provided for housekeeping information (identification of
the indexed document), it is particularly critical that the
order of the cards not be disturbed. In the modified
IBM 101, the last card in a sequence is diverted into a
sort pocket if it meets search specifications. This
necessitates manual interfiling of these cards before
another search can be started. A print-out of the
pertinent document number would have been the ideal
solution but the cost of a printing unit was prohibitively
high. Instead, a procedure called progressive sorting
is used (240). Cards are fed into the reject pocket
until the first hit is made. At this time the cards are
deflected into the next pocket until the second hit is
made. The cards are then deflected into the next pocket,
and so on until all the pockets are used. The bottom
card of each sort pocket identifies the document sorted
by the machine. The ordered state of the card file is
maintained by restacking the cards from each pocket in
sequence.
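
Progressive sorting can be simulated in a few lines. The Python sketch below rests on two assumptions that the text does not state outright: that the card producing a hit is the first card deflected into the new pocket (so that it lies at the bottom), and that the run stops when a hit occurs with no unused pocket left. The function name and the number of pockets are hypothetical.

```python
def progressive_sort(cards, is_hit, n_sort_pockets):
    """Distribute cards into pockets, changing pocket at every hit.

    Returns (pockets, remaining): pockets[0] is the reject pocket, and
    remaining holds the cards not yet fed when the pockets ran out.
    """
    pockets = [[]]
    for position, card in enumerate(cards):
        if is_hit(card):
            if len(pockets) - 1 == n_sort_pockets:
                return pockets, cards[position:]
            pockets.append([])       # deflect into the next pocket
        pockets[-1].append(card)     # first card in a pocket lies at the bottom
    return pockets, []


cards = list(range(1, 11))           # a deck of ten cards, in file order
hits = {3, 7, 9}                     # cards that satisfy the search
pockets, remaining = progressive_sort(cards, lambda c: c in hits, 2)

print(pockets)                       # reject pocket first, then sort pockets
print([p[0] for p in pockets[1:]])   # bottom card of each sort pocket
restacked = [c for p in pockets for c in p] + remaining
print(restacked == cards)            # file order is preserved
```

Restacking the pockets in sequence restores the original deck, which is the property the procedure was adopted to secure.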

Use: Three applications of the machine are de- 
scribed. In a 1957 publication on the ILAS (241) 
Andrews states that the coding system described (with 
signals, modulants, and interfixes) is still in the form- 
ative stage but that the ILAS was tested with the deck 
of punched cards prepared on medicinal compositions 
by the U. S. Patent Office in 1950. Descriptors were 
used for ingredients (a physical admixture of 2 or more 
compounds), complex natural products, and functions 
(disclosures of uses, properties, physiological behavior). 
The ingredients were coded as specific descriptors with 
more generic descriptors built in across the row of the 
card. Each ingredient was assigned a number of codes 
indicating various characteristics of the ingredient such 
as structural groups, function, and source of natural 
product. The multiple codes for each ingredient were 
tied together by a grouping signal. The set of
ingredients in a composition was tied together by another
signal indicating end of composition. In 4.5 minutes, 441
patents containing 6,262 disclosures characterized by a
total number of 18,650 descriptive terms were scanned.
This is a rate of 95 patents per minute (242). 

The second application of the ILAS, described in 
another Patent Office Research and Development Report 
(243), is in the polymer field. It deals with ethylenic
unsaturated homo- and copolymers, classified in class
260, subclasses 80-94.9 of the Patent Office Manual
of Classification. The descriptor list includes the fol- 
lowing types of disclosures: 

1. Monomers used in polymerization 

2. Inerts in the reaction 

3. Solvents in the reaction 

4. Catalysts 

5. Processes 

6. Conditions of the processes

7. Properties of the product 

8. Uses 

The disclosures are divided into 2 major subdivisions: 

1. Ingredients: terms related to chemical
compounds

2. Functions: terms including non-structure
terminology, identifying processes, properties,
conditions of reaction, and so on.

Each of the chemical compounds is given an identifying 
serial number. This number is entered on the card 
with a hexadecimal code. By using 4 hexadecimal
codes in a field, a total of 65,536 unique combinations
can be obtained. To put it in another way, when 16
positions are used in a row any one number from 1 to
65,536 can be obtained. In addition to this identifying
number the genus-species relationship is given for
each chemical. For example, in the case of the specific
compound ethylene the generic term mono olefin
hydrocarbon would also be coded. 

Functions of the chemical, e. g. solvent, catalyst, 
comonomer, are indicated in the modulant field by 
means of 2 hexadecimal characters. This provides a 
total of 256 unique modulants. 
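
The two counts follow directly from the 16 values of a hexadecimal character. The short Python sketch below merely checks the arithmetic; the function names are hypothetical, and the serial numbers are rendered as ordinary 4 character hexadecimal numerals (0000 through FFFF), which is an assumption about the numbering and not a statement of the Patent Office code.

```python
def combinations(hex_characters):
    """Number of distinct codes obtainable from a field of hex characters."""
    return 16 ** hex_characters


def encode_serial(number, width=4):
    """Render a serial number as a fixed-width hexadecimal numeral."""
    if not 0 <= number < combinations(width):
        raise ValueError("number does not fit in the field")
    return format(number, "0{}X".format(width))


print(combinations(4))       # serial number field of 4 characters
print(combinations(2))       # modulant field of 2 characters
print(encode_serial(255))
print(encode_serial(65535))
```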

Related codes are linked together with a grouping
signal. Thus, a phrase can be made out of the codes
for ethylene, mono olefin hydrocarbon, and comonomer. 
Another grouping signal is used to connect all the codes 
which pertain to the same document. 

Interfixes are used to show interrelations among 
various groups, for example the group for specific 
compounds and a specific process. Interfixes can also 
be used to show sequence in a process. 

Numerical information is also coded by ranges. 
This is useful for indicating operating variables in a 
process. These variables might be pressure, temper- 
ature, or time. 

Only half of the card, i.e., a 40 bit word per row
instead of an 80 bit word per row, is used for this
system. The first 4 bits (positions in each row) are
used for signals, the next 8 bits are for the modulant,
the next 16 bits are for subject matter, and the next
12 bits are for the interfix. The system is in operation
but no data are given on frequency of use.
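
The division of the 40 bit row can be written out as a field table. This Python sketch is illustrative only: it treats every field as a plain binary integer, which is a simplification (the signal and interfix fields are punched as hole patterns, not necessarily as numbers), and the names FIELDS, pack_row, and unpack_row are hypothetical.

```python
# Field widths in bits, in the order given in the text: 4 + 8 + 16 + 12 = 40.
FIELDS = [("signal", 4), ("modulant", 8), ("subject", 16), ("interfix", 12)]


def pack_row(values):
    """Combine the four field values into one 40 bit word."""
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError("{} does not fit in {} bits".format(name, width))
        word = (word << width) | value
    return word


def unpack_row(word):
    """Split a 40 bit word back into its four fields."""
    values = {}
    shift = sum(width for _, width in FIELDS)
    for name, width in FIELDS:
        shift -= width
        values[name] = (word >> shift) & ((1 << width) - 1)
    return values


row = {"signal": 0b1001, "modulant": 0x2A, "subject": 0xBEEF, "interfix": 0x005}
word = pack_row(row)
print(format(word, "040b"))
print(unpack_row(word) == row)
```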

The third application of the ILAS is described by 
Leibowitz, Frome, and Andrews (244). The subject 
matter covered in this experiment is in the thiazine 
art. This refers to a group of sulfur-containing
heterocyclic compounds. The entire 80 positions of each
row on an IBM card are used to encode subject matter 
and relationships. The arrangement of the card is 
similar to that described above (245). Three levels of 
descriptor groupings are used: 

Ring systems or chains (a chain is a contin- 
uity of chain units consisting mostly of func- 
tional groups in an acyclic arrangement of el- 
ements). 

Compounds 
Patents 

Each of these levels is connected with a grouping
signal. Groupings within smaller units, e.g., ring to
chain unit, are made by interfixes. The system
permits variability in scope because:

1. The 'building blocks' of the system are 
small units. These units are separately and 
independently described, which permits the ask- 
ing of a large number and a large variety of 
search questions for retrieval. 

2. The descriptors are variable in scope; one
descriptor may merely indicate a 6-membered
ring while another indicates a positional
relationship of heterocyclic elements or
substituent groups. . . .

3. Each collection of codes is gathered into a
substructural entity. This permits search 
questions with respect to chemical compounds 
which vary in scope as desired with respect 
to selected portions of the molecule. 

4. The groupings and interfixes provide the
ability to specify relationships among the sub- 
structures and to obtain as much specificity as 
desired with respect to the compound search 
(246). 

An interesting approach is the encoding of the com- 
pounds from the formulas by clerks. The system is 
being tested (247). 

Evaluation: The Patent Office is experimenting 
with a number of systems. The ILAS is only one of 
them and as such does not represent the ultimate
approach to the Patent Office's problem. There are
some very interesting features in the ILAS. Most elec- 
tronic systems for information retrieval formulated so 
far have been limited to a mere listing of descriptors. 
One thus gave up something very real when one departed 
from the traditional indexing system. In many cases 
the indexing was not so detailed, nor the subject matter 
complex, nor the size of the collection so large as to 
cause real trouble. The problem at the Patent Office 
demands a system which will permit precise specifica- 
tion of relationships among descriptors. This is made 
possible with the ILAS by means of the grouping signal
and the interfix. There is no IBM keypunch now
available which punches hexadecimal codes in rows of
punched cards. A two step operation is thus necessary.
This is somewhat awkward and increases the clerical
cost of the operation. The sorting rate of 450 cards 
per minute is not very fast for a large file, particular- 
ly since 2 or more cards are used per patent (an aver- 
age of 5 cards per patent in the first experiment) (248). 

Andrews compares the ILAS with a computer: 

At the rate of 450 cards per minute 90 rows of 
holes or 7200 bits of information are scanned 
each second. This is approximately one
twelfth of the scanning rate of commonly used
magnetic tape input units (90,000 bits per
second) but the fact that 12 sets of comparisons
are made simultaneously on the fly without ad- 
ditional computational time further reduces the 
gap between punched card equipment and elec- 
tronic computers costing at least 100 times as 
much as the ILAS machine. It should also be 
noted that the time required to prepare and 
load the machine with the data defining the 
search specifications usually takes only a mat- 
ter of minutes and is a direct operation sub- 
ject to visual checking whereas preparation of 
data for a computer usually must pass through 
several hands and, if trouble develops from 
the data, requires elaborate checking
procedures.

He does add, however: 

The magnitude of the entire Patent Office 
searching operation is such that ultimately the 
largest and fastest available computer or spec- 
ialized machine will be necessary (249). 
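
The figures in Andrews's comparison can be checked from the card format. The Python sketch below uses only numbers given in the text (450 cards per minute, 12 rows of 80 positions per card, 90,000 bits per second for tape); the function name is hypothetical.

```python
def scanning_rate(cards_per_minute, rows_per_card=12, columns_per_row=80):
    """Return (rows per second, bits per second) for a card sorter."""
    rows_per_second = cards_per_minute * rows_per_card / 60
    bits_per_second = rows_per_second * columns_per_row
    return rows_per_second, bits_per_second


rows, bits = scanning_rate(450)
tape_bits_per_second = 90000

print(rows)                          # rows of holes scanned each second
print(bits)                          # bits scanned each second
print(tape_bits_per_second / bits)   # how many times faster the tape unit is
```

The ratio comes to 12.5, which agrees with the "approximately one twelfth" of the quotation.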



F. Computers 
1. Paper Tape 
Western Reserve University Searching Selector 

A pilot project sponsored by the American Society
for Metals is now being conducted at Western Reserve
University to demonstrate the feasibility and advantages 
of applying computers to the retrieval and correlation 
of metallurgical literature. The project consists of
indexing 25,000 metallurgical publications over a 5 year
period (1955-60) and testing the indexed file. Informa- 
tive abstracts from Metallurgical Abstracts, the Journ- 
al of the Iron and Steel Institute, and Chemical Ab- 
stracts are used as sources for indexable information 
(250). 

The indexing technique for this project was devel- 
oped by Perry, Kent, and Berry and is described in 
their book on machine literature searching (251). In- 
dexable information in a document is translated into an 
artificially defined group of symbols called a telegraph- 
ic abstract. Among these symbols are semantic fac- 
tors, defined as 

... a carefully defined set of generic concepts 
... to serve both as reference points in
designating the scope of information requirements
and as a basis for designating important as- 
pects of subject contents of graphic records 
(252). 

A few hundred semantic factors sufficed as a basis for 
encoding a broad range of scientific and technical terms 
(253). Examples of encoded semantic factors are: 




B CT bacteria 

B SR absorb 

M CH machine 

T TR water 

Semantic factors are supplemented with 1-letter
codes to indicate permanent relationships; these are
called analytic relationships, and are exemplified by:

A class inclusion 

E material of composition 

Y attributive 

X absence of 

Semantic factors coupled with analytic relationships are 
illustrated by the following: 

MACH thermometer is a member of the class machine

TXTR anhydrous or absence of water (254)
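
The way an analytic relationship is fitted into a semantic factor can be shown with the examples just given. In the Python sketch below the underscore marking the open second position is an assumption made for the illustration, as are the names combine and interpret; only the factors and relationship letters listed above are used.

```python
# Semantic factors from the text, with "_" marking the open second position.
SEMANTIC_FACTORS = {
    "B_CT": "bacteria",
    "B_SR": "absorb",
    "M_CH": "machine",
    "T_TR": "water",
}
ANALYTIC_RELATIONSHIPS = {
    "A": "class inclusion",
    "E": "material of composition",
    "Y": "attributive",
    "X": "absence of",
}


def combine(factor, relationship):
    """Insert the 1-letter relationship code into the open position."""
    return factor.replace("_", relationship, 1)


def interpret(code):
    """Split a 4-letter code into its semantic factor and relationship."""
    factor = code[0] + "_" + code[2:]
    return SEMANTIC_FACTORS[factor], ANALYTIC_RELATIONSHIPS[code[1]]


print(combine("M_CH", "A"))
print(interpret("MACH"))
print(interpret("TXTR"))
```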

Empirical relationships, called synthetic relation- 
ships, are denoted by a 3-letter code such as: 

KAJ starting material 
KEJ material processed 
KOV properties given for 

In addition to the semantic factor and the 2 types 
of relationship indicators, the following devices are 
used in the telegraphic abstract: brackets, parentheses, 
and braces between symbols that constitute various des- 
ignated runs such as phrases, sentences, and para- 
graphs. There are also other symbols to distinguish 
between codes for terms that are so closely related 
that they would have the same semantic factors or to 
distinguish a given individual whose class has been
encoded. That is, they distinguish between different
models of a machine or between a pyrometer and a 
thermometer (255). 

The operation of the system is described by Kent, 
Melton, and Flagg (256). Published abstracts which are 
to be incorporated into the system are translated into 
telegraphic abstracts. Abstracts are first translated
into a standard form by eliminating the variations and 
complexities of the English sentence structure. This 
task is facilitated by making out a subject matter anal- 
ysis form. The end result of this step is a telegraph- 
ic abstract on the worksheets. The next step is to 
code the individual terms and phrases of the telegraph- 
ic abstract with the aid of a code dictionary. The en- 
coded telegraphic abstract is then translated into ma- 
chine form by recording it both on punched cards and 
8 channel punched paper tape. The telegraphic ab- 
stracts on 8 channel tape constitute the "library" which 
is scanned by the machine in a search. 

The hardware used in searching consists of a 
modified Flexowriter and a panel of circuits, cable- 
connected with the Flexowriter. The code of the ques- 
tion is spelled out by appropriate wiring. The logic of 
the question is programmed by additional levels of wir- 
ing from one panel to another. Logical sums, products, 
differences, and combinations of these logical operations 
can be used in searching. 

The operation of the selector is described by Kent, 
Melton, and Flagg:

The Flexowriter 'reads' 8-channel punched
paper tape in its reading unit. The punched
tape contains the encoded library which is
being searched. The panel of circuits
receives impulses from the Flexowriter,
according to which symbol punched on the tape is
being 'read'. Therefore, if the letter 'A' is
being read by the Flexowriter, the set of hubs
on the Selector which correspond to the letter
'A' will receive voltage. If the letter 'A'
appears in our question, we would have plugged
a wire into an 'A' hub, so that when the
voltage appeared, the logical apparatus of the
Selector is put into motion. If the letter 'A' is
not in our question, then voltage at the 'A'
hubs will disappear or be 'washed out' as soon
as the next symbol punched in the tape library
is read.

After the question on (for example) titanium
has been programmed into the Selector the
'Send' button on the Flexowriter is depressed
(which conditions the Selector to receive
impulses from the Flexowriter) and then the
button marked 'start read' is depressed. The
search of the punched paper tape is thus
initiated (257).

The result of the search can be a serial number
of an abstract or paper, the bibliographic citation, or
the entire abstract. The Selector, presently an
experimental model to try out the system of encoding
developed by Perry and his colleagues, is slow. Only 1
abstract is searched per minute. Ten questions can be
searched at one time (258). Procedures are being
worked out in cooperation with the Eastman Kodak
Company to encode the telegraphic abstracts onto
Minicards (259). The use of magnetic instead of paper tape
is also being considered in order to increase the speed
of searching.

Evaluation: The WRU Searching Selector is thus 
far an experimental model which is admittedly too slow 
for searching. The system is intended for more sophis- 
ticated machines which, according to Shaw, do not exist. 
In a review of Perry, Kent, and Berry's book on
machine literature searching Shaw states:

The basic assumption that underlies this
series of studies is that we have machines
capable of literature searching. . . . As a matter
of fact there are no machines in production
that will even do a fraction of what is claimed 
here (260). 

The telegraphic abstract-based system is also ques- 
tioned because of the resulting large record of the in- 
dex to a document in machine form. Shaw comments: 

It is difficult for this reviewer to see how 
much space is saved by using MUSRMACHTWMP 
03 for the word thermometer. It requires 40 
spaces to write Springfield, Illinois and 44 to 
write Chicago, so this notation would overflow
from a punched card with only Springfield and 
Chicago needed on one card. The usefulness 
of a notation this long on a medium capable of 
storing a total of 80 characters only is open 
to question (261). 

Punched paper tape instead of punched cards is now 
used by Perry as the storage medium; nevertheless 
Shaw's argument is still highly pertinent. The system
is cumbersome.

The amount of detail which is encoded in illustrat- 
ed telegraphic abstracts (262) is also open to question. 
The professional and clerical input cost of the system 
seems to be unduly high and in poor balance to the
output cost. The large resulting machinable record
appears to require a machine with an unnecessarily high
searching rate, especially since we are dealing with an
end-to-end search type of system. The detail of
encoding appears to be based on the expectancy of
searches for very detailed information. This type of search
might be more efficiently conducted by restricting the 
machine search to the elimination of the bulk of the 
nonpertinent material and finishing the search manually. 



Bendix G-15-D Computer

This computer consists of a Flexowriter (the input,
output, and control unit) and a console which contains
the arithmetic unit and the 2,176 word (63,004
digit) magnetic drum memory. An auxiliary magnetic
tape memory can also be incorporated into the com- 
puter, as can punched card and magnetic tape input and 
output. The input and output operations proceed with- 
out interrupting computation (263). 

Only one application of this computer to literature 
searching has been reported thus far. The work was 
done jointly by the U. S. Patent Office and the E. I. 
DuPont de Nemours Company. It is reported in ORD 
Report #13 (264), in which a code for ethylene polymer 
art to be used with either the ILAS or the Bendix
G-15-D is discussed. The basic code has been reported
above under the ILAS section. The principal refinement
introduced for the Bendix G-15-D is called the
weighting procedure. 

Each subject group in a question set is
assigned a relative numerical value, called
weight, in accordance with its considered
importance. During the search the machine
operates in the following manner. When the
subject is found for its first appearance in the
disclosure the weight assigned to the subject is
recorded. The weight is not recorded for any
additional finding of the same subject. As each 
additional subject is found its weight is added 
to the previously recorded weights of other 
subjects. The total weight at the end of the 
search is compared with a minimum weight 
which has been assigned concomitantly with 
the questions. Assume, for example, subject 
A is assigned a weight of 1, B a weight of 2, 
and C a weight of 3. When the machine finds 
subject A, it records weight 1 and when it 
finds C it adds 3 to 1 and if it finds B adds 
2 to give a total weight of 6. A complete 
answer to the question is A plus B plus C. 
If, in addition, answer B plus C is also ac- 
ceptable, the minimum weight assigned would 
be 5 (265). 

The results of the machine search are printed out by
the computer. The print out for each document includes
the relative weight of the document (maximum or mini- 
mum). 
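
The weighting procedure quoted above lends itself to a brief illustration. The following Python sketch reproduces the worked example (A = 1, B = 2, C = 3, minimum weight 5); it is not the Bendix program itself, which is not reproduced in the report as cited here, and the function names are hypothetical.

```python
def document_weight(weights, subjects_found):
    """Sum the weight of each question subject the first time it is found."""
    recorded = set()
    total = 0
    for subject in subjects_found:
        if subject in weights and subject not in recorded:
            recorded.add(subject)        # later findings add nothing
            total += weights[subject]
    return total


def is_answer(weights, subjects_found, minimum_weight):
    """True if the accumulated weight reaches the minimum for the question."""
    return document_weight(weights, subjects_found) >= minimum_weight


weights = {"A": 1, "B": 2, "C": 3}

print(document_weight(weights, ["A", "C", "B", "A"]))  # complete answer
print(is_answer(weights, ["B", "C"], 5))               # B plus C is acceptable
print(is_answer(weights, ["A", "C"], 5))               # A plus C falls short
```

Setting the minimum at 5 admits the complete answer (weight 6) and B plus C (weight 5) while rejecting every other combination of the three subjects.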

Use: A flow chart and program were written for 
the Bendix G-15-D Computer and an operable program
was set up. Preliminary tests with a computer pro- 
gram have been encouraging (266). 



National Bureau of Standards, 
Automatic Microimage File 

The Automatic Microimage File is a device for
the rapid location and copying in enlarged form of
microfilm images. The device is described in two very
similar articles (267, 268). Images of documents or
other forms of information are reduced to 0.1 inch
frames of microfilm. On a 10-inch square sheet of
microfilm, 10,000 of these frames are recorded. This
sheet is termed a matrix. The locations of these 
frames on the 10-inch square microfilm are recorded 
as 20 bit numbers on a perforated teletype paper tape: 

The instrument is essentially a combination of 
digital computer electronic circuitry and a 
pair of precision servomechanisms that search 
X and Y axes of the matrix. The location of 
the desired frame is fed into a 20 bit (binary 
digit) register from the teletype tape. The 
register consists of a capacitor memory and 
coincidence identification circuitry. The first 
10 bits recorded in the register control the Y 
position selection while the second 10 bits con- 
trol the X position. 

The matrix is supported on a drum 10 in. 
in diameter and is fastened at one edge with 
dowel pins to insure its accurate location on 
the drum. The drum is servo-controlled in 
both linear and rotary axes of motion, corres- 
ponding to the X and Y axes of the matrix. 
The servos that shift the matrix to the chosen 
coordinates are mechanically coupled with pre- 
cision gearing to two code commutators. 

The code commutators, one associated with 
each axis, control the coordinate positions to 
which the matrix is located. These commuta- 
tors are photoetched with one hundred 10-bit 
numbers corresponding to the standard teletype 
binary bit code. The two particular positions 
on the commutators are selected by a serial 
mechanical search with contacting brushes un- 
til a code combination is found that matches 
the binary bits recorded in the 20-bit register. 
Magnetic clutches and brakes provide rapid 
starting and stopping of the drum with uniform 
overtravel in the location of every position on 
the matrix. A single induction motor supplies
all motive power to the machine. 

At the beginning of the cycle of operation, 
a teletype tape reader reads a 4-decimal-digit 
number into the 20-bit register in terms of a 
binary digit code. A space symbol is custom- 
arily inserted in the teletype tape following 
each 4-digit number. On detecting this space 
symbol, the machine's program control stops 
the tape reader, engages the magnetic clutches 
on the X and Y servos, and looks for the 
compatible code on the two coordinate axes. 
When the compatible code is found, the clutch- 
es disengage and magnetic brakes stop the 
drum. A print lamp is briefly turned on to 
make a photographic exposure of the selected 
microfilm frame on the photosensitive paper. 
When the exposure is completed, the teletype 
tape advances to the next instruction, the drum 
returns to its zero position, and the machine 
proceeds to the next search cycle (269). 
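
The addressing scheme can be sketched as follows. The Python illustration below assumes that each decimal digit occupies 5 bits of the register and that the first two digits select the Y position and the last two the X position; the 5 bit patterns used here are plain binary numerals chosen for the example, not the actual teletype code, and all function names are hypothetical.

```python
def digit_code(digit):
    """Hypothetical 5 bit pattern for one decimal digit (plain binary)."""
    return format(int(digit), "05b")


def register_bits(address):
    """Load a 4-decimal-digit frame address into a 20 bit register."""
    if len(address) != 4 or not address.isdigit():
        raise ValueError("address must be four decimal digits")
    return "".join(digit_code(d) for d in address)


def split_register(bits):
    """First 10 bits control the Y position, second 10 bits the X position."""
    return bits[:10], bits[10:]


def frame_position(address):
    """Return (x, y), each running from 0 to 99 on the 100 by 100 matrix."""
    return int(address[2:]), int(address[:2])


bits = register_bits("0742")
print(bits)
print(split_register(bits))
print(frame_position("0742"))
print(100 * 100)                 # frames addressable on one matrix
```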

The Automatic Microimage File appears useful 
for storing documents in highly reduced form and for 
obtaining somewhat enlarged copies of these reduced 
documents. The images thus produced will have to be 
further enlarged before they can be read by the naked 
eye. The instrument is not a subject index searching 
device since it can only make copies of documents 
whose location is specified. 

The Automatic Microimage File might have
application in a 2-step searching system. The first step in
such a system would be the subject search of the index. 
Document serial numbers would be the result of this 
search. The serial numbers would also indicate the 
location of the documents in the Automatic Microimage 
File. The second step would be the retrieval of copies 
of potentially useful documents by means of the Auto- 
matic Microimage File. 



2. Magnetic Tape 

The Special Purpose Electronic
Searching Machine

One of the earliest suggestions for using a digital 
computer for information searching was made by Bagley. 
In 1951 Bagley and Perry presented a paper on this 
subject, based on Bagley's Master of Science thesis
(270). The paper is interesting for two reasons. First,
it gives Bagley's and Perry's early opinions on the
division of labor between machine and manual searching.
They state: 

Information must be analyzed and encoded in 
such a way that a search by machine for a 
given subject will direct attention to all the 
pertinent documents while rejecting nearly all 
those documents which are not of immediate 
interest. It is not advisable to require that 
machine searching do more than this and re- 
ject all material at or near the borderline of 
interest. In the first place, pinpoint accuracy 
in machine searching would require excessively 
detailed analysis of information. Furthermore, 
material near the borderline of interest may 
prove surprisingly useful to the person request- 
ing the search. A reasonable division of labor 
should be the goal, with the machine screening 
out so much of the unwanted material that final 
review of the machine-selected items places
no excessive burdens on the person for whom 
the machine search was made (271). 

The second interesting point made in this paper is the 
relative slowness of the general purpose digital
computer when applied to information searching, this despite
the fact that the then fastest computer, the Whirlwind I,
could perform 16,000 operations per second. Bagley
and Perry write:

Our study revealed that a search of average 
complexity, in which the machine is required 
to scan an encoded index. . . requires that Whirl- 
wind in its present form would take about 3 
seconds to scan and inspect the entries for a 
single document. By redesigning the computer 
element of the machine so that it might per- 
form certain operations more simply than at 
present, it might be possible to reduce the 
time of searching of a single index block by a 
few tenths of a second--but no more. At this
rate it would still take more than 800 hours 
to search the index to a million documents 
(272). 

The reason for this slowness lies in the design 
of the general purpose digital computer. The machine 
has only one computing element to perform all the op- 
erations on the numbers in the computer. The rest of 
the machine is devoted to storage of numbers, to the 
control of these numbers when the machine is in oper- 
ation, and to input and output devices. Much of the 
computer operating time is spent in transferring num- 
bers back and forth and in keeping track of results of 
past operations. Bagley and Perry estimate that the 
actual identifying operations require 10% of the total
running time; the other 90% is consumed, they believe, 
in transferring and storing intermediate results. As a 
solution to these problems, Bagley and Perry suggest 
the construction of a special purpose searching machine: 

A major advance in the rate of scanning and 
selecting could be accomplished if the machine 
were designed so that all criteria used to de- 
fine the search would be checked simultaneous- 
ly against each index entry being scanned. It 
would also be advantageous to design the elec- 
tronic searching machine with a separate op- 
erations element to perform the two different 
types of operations, namely, one, detecting
the identity of search criteria to index entries,
and two, establishing that the index entries pertaining
to some one document stand in the specified
relationship to each other. As a consequence,
the special-purpose searching machine
would be much simpler in design than the gen- 
eral-purpose computer. In particular, it is 
possible to design a relatively simple device--
the comparator--for simultaneous scanning of
index entries and another simple unit--the logical
computer--for establishing that specified
relationships exist between the index entries 
detected by the comparator (273). 

The input into this machine would be the coded index as 
binary numbers on magnetic tape. In order to be able 
to run the tape through the machine at the rate of sev- 
eral hundred inches per second a special base for the 
magnetic tape is suggested. This is a durable alloy 
rolled into a strip of about 1 inch width. Program- 
ming the machine should eventually be by punched cards. 
A magnetic drum could be used for temporary storage 
of the output of the machine until it could be printed 
either by an electric typewriter or by a device which 
burns characters on a paper tape. The special purpose 
searching machine can be designed to scan the index 
for 1,333 documents every second. This is equivalent
to 4.8 million documents per hour, though Bagley and
Perry indicate that this speed might be reduced some- 
what since information cannot be fed into the machine 
or removed from the machine at this rate. 
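
The hourly figure follows directly from the stated scanning rate. The short sketch below, which assumes the rate of 1,333 documents per second is sustained without input or output delays (the text notes that it may not be), also shows what that rate would mean for a file of a million documents.

```python
# Throughput of the proposed special purpose machine, from the figures above.
# Assumption: the scanning rate is sustained, ignoring input/output limits.

DOCS_PER_SECOND = 1_333

docs_per_hour = DOCS_PER_SECOND * 3600
seconds_for_million = 1_000_000 / DOCS_PER_SECOND

print(docs_per_hour)                       # 4798800, i.e. about 4.8 million
print(round(seconds_for_million / 60, 1))  # 12.5 (minutes per million documents)
```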

The paper represents an interesting example of 
pure "blue sky" thinking. Although it is cited frequently
in the documentation literature, no further work
seems to have been done on the proposed system. 



Standards Electronic Automatic Computer (SEAC) 

The HAYSTAQ (Have You Stored Answers to 
Questions) system is an experimental machine literature 
searching system jointly developed by the U. S. Patent 
Office and the National Bureau of Standards. Subject 
matter selected for encoding into this system includes 
chemical compounds, admixtures, and processes disclosed
in patents. Searches are made on the SEAC
computer. The SEAC is a general purpose digital com- 
puter located at the National Bureau of Standards offic- 
es in Washington. Changes have been or are being 
made in the computer's memory capacity, input medium,
and logical operations repertoire to increase the
computer's efficiency for this application. Another 512 
words of mercury delay-line memory have been added 
bringing the total memory capacity to 2048 words. 
Eight Ampex tape units are being installed to meet the 
need for a large-capacity, high speed input medium. 
The disclosure file in the index will be stored on 6-channel
tapes, which will be approximately 2,000 feet
long with a packing density of 200 bits to the inch and 
a speed of 40-60 inches per second. A shift order for 
either right or left shift and an equality comparison 
order have been added to SEAC's command capabilities
(274). 
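
The tape figures above give a rough idea of the capacity of one reel and the time for one pass. The sketch below assumes that the packing density of 200 bits to the inch applies to each of the 6 channels and that the whole reel is usable, neither of which is stated in the source.

```python
# Rough capacity and pass time for one disclosure tape, from the figures above.
# Assumptions (not in the source): 200 bits per inch applies per channel,
# and the full length of the reel carries data with no gaps.

TAPE_FEET = 2_000
BITS_PER_INCH = 200
CHANNELS = 6
SPEEDS_IPS = (40, 60)  # inches per second, lower and upper bound

tape_inches = TAPE_FEET * 12
bits_per_channel = tape_inches * BITS_PER_INCH
total_bits = bits_per_channel * CHANNELS
pass_seconds = [tape_inches / speed for speed in SPEEDS_IPS]

print(total_bits)    # 28800000
print(pass_seconds)  # [600.0, 400.0], i.e. 10 minutes down to under 7
```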

The final HAYSTAQ system will include: 

1. A data preparation routine for the 
"library" making up the complete disclosure 
file of information; 

2. A data preparation routine for the ques- 
tion; 

3. The search routine with included sub- 
routines; 

4. The checkout routine which evaluates 
the apparent answers found to questions. 

Only the search routine has been written and "debugged" 
(275). 

Chemical disclosures--indexable information, in
this case--to be encoded into the system are grouped
by document. The document is divided into composi- 
tions, that is, into physical admixtures of materials. 
A composition is further subdivided into ingredients or 
items. In the field of chemistry, descriptors are used 
for functional groups, compound formulae, and molecular
structure. The item itself is made up of a series
of descriptor words. The levels of organization are 
thus document, composition, item, and descriptor. 

In a search, the question is stored in SEAC's
high speed memory and remains there throughout the 
search. The disclosure file, except for the encoded 
molecular structure data, is stored on a reel of mag- 
netic tape called the principal tape. One disclosure 
composition at a time is read into SEAC's memory and
compared against pre-stored question compositions. The 
search progresses through various levels of organiza- 
tion. The question descriptor is matched against the 
disclosure descriptor. If no match is found for any 
question descriptor, further searching of the item is 
unnecessary and the search progresses to the next item. 
If any question cannot be matched, the search reverts 
again to the composition level and a new composition is 
inserted into the SEAC memory. 

There are 3 types of descriptor searches: 

Chemical descriptors (functional groups) 
Empirical formulae 
Molecular structure 

When an apparent answer to a question item in the 
principal tape is found (a chemical descriptor or an 
empirical formula) a hit word is stored on one of the 
auxiliary magnetic tape units. If the question calls for 
a molecular structure search, the molecular structure 
data which is encoded onto a separate magnetic tape-- 
called the secondary tape--is read into the SEAC. The 
principal and secondary tapes are so coordinated that 
coded structural data is immediately available when re- 
quired (276). 

In case all the question items in the composition 
have found answers the validity of the answers is 
checked by a machine program so that false drops can 
be eliminated. This is done by reading the complete 
set of hit words in from the tape and processing it 
through the check-out routine. In the case of a mixture 
of materials, the relationships among the hit words are
checked to see whether these relationships are the 
same as the relationships sought in the question. When 
complete answers are checked out, the identification of 
the document is printed and the search continues with 
the next document (277). 

The HAYSTAQ system requires an end to end 
search and various devices are used to speed up 
searches. The ordering of data is one such device. 
All descriptors in an item are arranged in ascending 
series according to their first digits which identify the 
category of subject matter. Whenever the first digit of 
a disclosure descriptor is greater than that of the ques- 
tion it is useless to look further in that disclosure item. 
The next item is considered. Data is also screened at 
the document level to eliminate fruitless searching. If, 
for example, the question requires a particular process, 
the document is examined to determine whether any pro- 
cess is included. If not, the entire document is elim- 
inated from the search. Further screening is done to 
see if the document contains at least as many composi- 
tions as are required by the question. 
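
The two speed-up devices described above, ordered descriptors and document-level screening, can be sketched as follows. This is an illustration only: the descriptor codes, the dictionary keys, and the function names are hypothetical and are not taken from the HAYSTAQ routines.

```python
from typing import Dict, Sequence


def item_contains(disclosure_item: Sequence[str], question: str) -> bool:
    """Scan a disclosure item whose descriptors are sorted by first digit.

    The scan stops as soon as a descriptor's first digit exceeds that of the
    question descriptor, since no later descriptor in the item can match.
    """
    for descriptor in disclosure_item:
        if descriptor[0] > question[0]:
            return False  # past the question's category: give up early
        if descriptor == question:
            return True
    return False


def document_passes_screen(document: Dict, question: Dict) -> bool:
    """Reject a document before any descriptor matching is attempted."""
    if question["needs_process"] and not document["has_process"]:
        return False
    return len(document["compositions"]) >= question["min_compositions"]


# Hypothetical descriptor codes; the first digit stands for the category.
item = ["1204", "3310", "3377", "5120"]
print(item_contains(item, "3377"))  # True
print(item_contains(item, "2000"))  # False, scan stops at "3310"
```

The early exit depends entirely on the descriptors having been sorted when the disclosure file was prepared; on unsorted data it would miss matches.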

An example of a chemical structure search called 
the topological structure search is given in an ORB re- 
port (278): 

Each atom (except non-significant H, which are
the ones attached to the C of the hydrocarbon 
skeleton) and each significant bond (bonds other 
than single) in the structure is assigned an 
arbitrary serial number or 'interfix'. Consecutive
numbers beginning with 1 are assigned to
the bonds and elements in a completely random 
manner. The complete notation for each atom 
or bond coded is contained on one computer 
word and includes a symbol identifying it (e.g.
by atomic number), its serial number, and the 
serial number assigned to each of the atoms 
or bonds to which it is connected. Both ques- 
tions and disclosures are similarly coded. The 
serial numbers assigned to the bonds and ele- 
ments in a question will seldom be the same as 
those assigned to a corresponding structural
fragment of a disclosure compound containing 
the fragment sought. However, correspond- 
ence of numbers in this respect is not re- 
quired by the computer program. . . . During 
the search, the machine inspects the first 
atom listed in the question code and each of 
the atoms in the disclosure code being consid- 
ered until it finds a match. The atoms con- 
nected to this question atom are then deter- 
mined and their correspondence to those con- 
nected to the first acceptable disclosure atom 
are compared until further matches are found. 
This continues until all of the bonds and atoms 
in the question, together with their proper 
connective relationships, are examined in the 
disclosure compound or until it is determined 
that no such total matching is possible. Note 
that the carrying forward of such comparisons 
is based upon identifications of elements and 
bonds and the identification of the adjacent el- 
ements and bonds. 

Use: In 1957, a file of 250 complex steroid 
structural formulas was completely searched in 8 min- 
utes. It was claimed that further experiments to de- 
termine the best method of asking a question have 
shown that it is possible to lower this time to approx- 
imately 8 seconds (279). In 1958, tests on the 
HAYSTAQ system were also conducted by simulating a 
computer on a blackboard (280). 

HAYSTAQ is considered as an example of a gen- 
eral approach to a large scale, mechanized searching 
system. It is said to imitate to a large extent the 
search performed manually by a human searcher (281). 
The system is still considered highly experimental by 
its developers (282). It has been primarily an exer- 
cise in developing methodology in a field in which no 
guiding generalities have been established. 



IBM 700 Series Electronic Data 
Processing Machines 

The IBM 700 Series machines are general purpose
large scale computers which, as their name indicates,
are manufactured by the International Business Machines
Corporation. The IBM 701 and 704 are primarily used
for scientific and engineering calculation (considerable
calculation on a relatively small amount of data), while
the IBM 702 and 705 are used primarily for accounting
or similar commercial applications (a small amount of
calculation on a large amount of data).

The IBM 704, for example, has the following fea- 
tures: 

Input

Punched cards
Magnetic tape

Memory

8,192 words (10 decimals and a sign) magnetic core memory
8,192 or 16,384 words magnetic drum memory

Output

Off-line printer
On-line high speed printer
Cathode ray tube
Magnetic tape
Punched cards (283)

IBM 701: 

U. S. Naval Ordnance Station, China Lake, California 

The searching system at the China Lake U. S. 
Naval Ordnance Station includes only reports. It does 
not include periodicals and books. The subject matter 
is for the most part related to the development and 
testing of items of Naval ordnance (284). A manual 
Uniterm system used at this installation since 1953 was 
converted for IBM 701 searching although the manual 
Uniterm system itself was not discarded (285). When 
the information was transferred onto tapes for machine
searching, 14,000 reports were covered by the system.
About 9,600 descriptors were used and each document
was indexed under about 8 descriptors. The index to
14,000 documents was stored on 1 1/3 reels of magnetic
tape. About 300 new reports were added monthly
(286). 

Each descriptor was assigned a number and was 
punched on a set of IBM cards together with all the re- 
port numbers associated with that descriptor. Search- 
ing consists of an input, searching, and output phase. 
Bracken and Tillitt state: 

Phase I, Input: The input phase first read in- 
to memory up to 75 sets of punched cards. 
Each set of cards, called a data set, contains 
the necessary information to conduct one 
search. The first card of a data set gives 
the number of descriptors to be used in a 
search and controls the output printing of the 
search. For example, given eight descriptors 
for a search, it is possible to print the match- 
ing report numbers for all eight descriptors, 
plus the matching report number for the first 
two, the first three, through the first seven 
descriptors. 

Two additional cards complete a data set 
(only one card if six or less descriptors) and 
contain both descriptor numbers and particular 
report numbers. These descriptor numbers 
define the subject and the report numbers are 
used as starting points in the search. . . . The 
data sets are first written on a drum, the 
descriptor numbers for all data sets are then 
sorted (eliminating duplication) and the corres- 
ponding unit records on the library tape are 
found and written on tape. . . . 

Phase II, Searching: Phase II executes the
searching procedure using the working tape 
and the data sets as prepared by Phase I. A 
data set is first read from drum into memory. 
The first two unit records corresponding to 
the first two descriptors of the set are found 
on the working tape and stored in memory.
The report numbers are then compared and 
matching ones saved. (They will be written 
on the result tape if a print, for these two 
alone, has been specified on the data set con- 
trol card). The third unit card corresponding 
to the third descriptor of the data set will be 
read from the working tape and its report 
numbers previously found. This process is
repeated for all of the remaining data sets

Phase III, Output: Phase III prints the result
tape. There may be a maximum of seven list- 
ings for each search if the search used eight 
descriptors, and if the data set control card 
specified a listing of matching report numbers 
from the first two, three ... eight descriptors.
For each listing, the identifying descriptors, 
the report numbers. . . and the matching report 
numbers are printed. If no matching report 
numbers are found then the word NONE will 
print (287). 
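
The searching and output phases described in this quotation amount to intersecting report-number lists for the first two descriptors, then the first three, and so on. The sketch below is a minimal illustration of that progressive coordination; the library contents, descriptor numbers, and function names are hypothetical and do not reproduce the IBM 701 program.

```python
from typing import Dict, List, Sequence, Set

MAX_DESCRIPTORS = 8  # the quotation uses eight descriptors as its example


def coordinate(library: Dict[int, Set[int]],
               descriptors: Sequence[int],
               print_levels: Sequence[int]) -> Dict[int, List[int]]:
    """Intersect report-number lists for the first 2, 3, ... descriptors.

    For each requested level k, return the sorted report numbers posted
    under all of the first k descriptors.
    """
    if not 2 <= len(descriptors) <= MAX_DESCRIPTORS:
        raise ValueError("a search coordinates 2 to 8 descriptors")
    matches = set(library.get(descriptors[0], set()))
    results: Dict[int, List[int]] = {}
    for level, descriptor in enumerate(descriptors[1:], start=2):
        matches &= library.get(descriptor, set())
        if level in print_levels:
            results[level] = sorted(matches)
    return results


def listing(results: Dict[int, List[int]]) -> List[str]:
    """Format one line per listing, printing NONE when nothing matched."""
    lines = []
    for level, reports in sorted(results.items()):
        body = " ".join(str(r) for r in reports) if reports else "NONE"
        lines.append(f"first {level} descriptors: {body}")
    return lines


# Hypothetical library: descriptor number -> report numbers posted under it.
library = {101: {1, 2, 3, 5, 8}, 205: {2, 3, 5, 9}, 317: {3, 5, 7}, 422: {5}}
for line in listing(coordinate(library, [101, 205, 317, 422], [2, 3, 4])):
    print(line)
```

With eight descriptors and every level requested, this yields the seven listings mentioned in the quotation.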

A maximum of 8 descriptors can be coordinated 
in one search. Maximum space for 6,750 report numbers
has been provided for each descriptor (288). Sixteen
library searches are made 3 times a week; a total
time of 11 minutes is required for the 16 searches, not 
including the time it takes to change the tapes (289). 
Plans are being made to install new tapes having twice 
the density and twice the length of the present tape. A 
core memory will be added to the IBM 701. This is about
twice as fast as electrostatic memory. Such improvements
will cut down the time for 16 searches to
approximately 5 minutes (290). 

Bloomfield--formerly Head of the Acquisitions and
Records Branch of the Library Division at the China
Lake Naval Ordnance Station--comments on the first
experiment with 3,000 reports prepared for IBM 701
searching: 

I believed that the reference librarians could 
do all the coordinating necessary in the amount 
of time it would take for IBM processing. . . . 
It is necessary to go to the accession list to
find the titles of the accession numbers re- 
vealed by the Coordinate Index. This is also 
true for the accession numbers the IBM ma- 
chine would produce. Also, one would have to 
go to the code to get appropriate numbers of 
the Uniterms so that the IBM computer could 
search its tape. If one needed to enlarge 
upon our initial approach by searching more 
Uniterms, the use of the Library's Coordinate
Index would be simple, but if the IBM computer
were involved it would mean rerunning
the tape. Finally I was of the opinion that
reference work requires some imagination and 
the IBM computer has none (291). 

A note to Bloomfield's paper indicates that the paper 
does not represent the official view of the Naval Ord- 
nance Test Station (292). 

A note from Miss Canova, librarian at the Station, 
states that the searching program is scheduled to be 
transferred to the IBM 704 computer. She expresses 
the hope that results of the machine search can be ex- 
panded to give the corporate entry and title as well as 
the accession number (293). 

The use of a high speed, higher cost computer in 
this installation can be questioned on several grounds. 
One was mentioned by Bloomfield when he indicated that 
it takes just as much overall time to do a search with 
a computer as it takes for manual searching of numbers. 
It must be remembered that in addition to the actual 
computer operating time, time must be added for trans- 
lating descriptors into machine language and for pro- 
gramming the computer. 

At present the computer search does not provide 
any more information than the equivalent manual search, 
namely document serial numbers of potentially pertinent 
documents. Plans are being made to include the cor- 
porate author and title as part of the search result, but 
it seems somewhat questionable whether this is more 
efficient than having a clerk manually pull the accession 
cards of the documents themselves from the file.

Lastly, since the manual coordinate index is still 
being kept, it is doubtful whether the extra amount of
effort involved in preparing the computer based system 
is warranted even though "free time" might be avail- 
able on the computer. 



IBM 704 Electronic Data Processing Machine: 
General Electric Company, 
Aircraft Gas Turbine Division 

Over 30,000 documents are included in this indexing
system for scientists and engineers of the Aircraft
Gas Turbine Division (AGT) of General Electric. The
system is based on the use of descriptive keywords
(descriptors) and document file numbers. All technical
documents in the AGT library are identified in terms of 
these keywords. Each document may have a dozen or 
more words to describe it. There are over 7,000 keywords
describing the documents. The arrangement is
by document numbers on descriptor cards. The index,
i.e., the document numbers on descriptor cards, and a
concise abstract are entered on magnetic tape. The
30,000 abstracts are encoded on 3 reels of magnetic
tape. 

The retrieval system, using the IBM 704 computer, 
can search through the entire list of numbers in less 
than 3 minutes. The high speed printer can print out 
abstracts of documents found in the search in less than 
15 minutes, depending on the number of documents 
found (294). It is stated: 

In its present form the system can accommo- 
date 1,000,000 abstracts, 56,000,000 file num- 
bers, and can perform up to 99 simultaneous 
literature searches. . . . New computing equip- 
ment is available which could increase the 
speed of our present system by 1,000 percent,
while the document storage capacity would be
around 10,000,000. . . . An automatic information
retrieval system that searches out written
information 1,000 times faster than a man can
do has been developed and placed in operation
for the technical library at General Electric's
Aircraft Gas Turbine Division in Cincinnati, 
Ohio (295). 

No other details are given about the system. 

Evaluation: The IBM 704-based system, like the 
IBM 305 RAMAC-based system, employs an inverse 
order of filing, that is, the filing of document numbers
on descriptors. The descriptors here are merely listed. No
relationships among descriptors are specified. It is not
stated whether equal space is provided after each descriptor for
document numbers. The method for entering document 
numbers on the descriptor cards is not indicated. 

The system appears to be a mechanized version 
of the "Uniterm" or "Peek-a-Boo" system with one
added feature. Abstracts of potentially pertinent docu- 
ments can be printed out by the high speed IBM 704 
printer. Whether this is more economical than having 
copies of these abstracts pulled out of the file and du- 
plicated is not discussed. 

How information is searched 1,000 times faster
than with an equivalent manual system, such as the
manual "Uniterm" system, is not mentioned.



IBM 705 Electronic Data Processing Machine: 
Institute for Cooperative Research, 
University of Pennsylvania 

A system for searching document collections with 
an IBM 705 was designed at the Institute for Coopera- 
tive Research, University of Pennsylvania; bibliograph- 
ic and subject information are to be encoded for ma- 
chine searching. 

The documentary information fed into the com- 
puter memory is called the document record 
in this system. It consists of two sections: a 
bibliographic facts section and descriptor section.
The bibliographic section contains seventeen
facts about the document, such as identifying
and locating number, the author or authors,
the publisher, the date of publication,
etc. The descriptor section of the document 
record contains an average of thirty descrip- 
tors per document to define the contents of the 
document (296). 

The document is identified with a serial number. Infor- 
mation about one document is kept together and unless 
a division by subject or other dividing criteria is con- 
templated (this is not mentioned) the entire file has to 
be scanned for each search. Descriptors for each document
are divided into arbitrary groups so as to reduce
the incidence of false descriptor coordination, i.e.,
the coordination of unrelated descriptors in a document
(297). 
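
The effect of dividing descriptors into groups can be illustrated with a small sketch. It assumes, which the source does not state, that a search is satisfied only when all question descriptors fall within a single group of the document record; the example descriptors and function names are hypothetical.

```python
from typing import List, Set


def matches_flat(groups: List[Set[str]], query: Set[str]) -> bool:
    """Coordinate across the whole document, ignoring group boundaries."""
    return query <= set().union(*groups)


def matches_grouped(groups: List[Set[str]], query: Set[str]) -> bool:
    """Require all query descriptors to occur together in one group."""
    return any(query <= group for group in groups)


# A hypothetical document that discusses aluminum casting and steel welding.
record = [{"aluminum", "casting"}, {"steel", "welding"}]
query = {"aluminum", "welding"}

print(matches_flat(record, query))     # True: a false coordination
print(matches_grouped(record, query))  # False: the terms are in different groups
```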

A special assembly of card punch equipment and 
a number of remotely controlled typewriters will be 
used to produce in one operation all the cards for the 
auxiliary card file and the necessary control records. 
The auxiliary card file is a simplified card catalog 
since the author, S. R. Moyer, feels that searches for 
certain types of information--e.g., searches for a particular
document, author searches, and searches under
broad classes--are still done more efficiently in the
old-fashioned way (298). The necessary control records 
are accession records, the shelf list, and serials hold- 
ings. 

Information to be searched by the IBM 705 is en- 
tered on magnetic tape from the punched card record. 
Each 2,500 foot reel of tape will hold the bibliographic
facts and descriptors of 14,000 documents. The machine
can search at the rate of 150,000 documents per
hour. The results of the search will be delivered by 
the IBM 705 printer in the form of a list of identifying 
and locating numbers for the documents. The printer 
can also print the bibliographic facts for each document 
(299). 

The cost of the system is estimated. The rental 
of the IBM 705 is $28,180.00 per month. The cost of
personnel and material and the machine rental is estimated
to be $41,180.00 per month.

Under the system outlined, the machinery 
could search, barring breakdown and excluding 
multiple searching, 3.15 million documents 
every 24 hours (21 hours of work). If the 
machine were operated 20 days per month, the 
cost would be approximately $653.65 per search
of 1,000,000 documents, or 18.9% of the $3450
necessary to carry out the same operation by 
a completely human agency (300). 

Moyer bases the cost of a manual search on experience 
which he has had searching a collection of 1,000,000
documents. 
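
Moyer's cost figures can be reproduced from the numbers given above. The sketch below uses only values stated in the text (monthly cost, searching rate, 21 working hours per day, 20 days per month, and the manual cost of $3450); the variable names are illustrative.

```python
# Reproducing the cost estimate from the figures stated in the text.

MONTHLY_COST = 41_180.00            # rental plus personnel and material
DOCS_PER_HOUR = 150_000
HOURS_PER_DAY = 21                  # "24 hours (21 hours of work)"
DAYS_PER_MONTH = 20
MANUAL_COST_PER_MILLION = 3_450.00  # Moyer's figure for a human search

docs_per_day = DOCS_PER_HOUR * HOURS_PER_DAY
docs_per_month = docs_per_day * DAYS_PER_MONTH
cost_per_million = MONTHLY_COST / (docs_per_month / 1_000_000)
ratio = cost_per_million / MANUAL_COST_PER_MILLION

print(docs_per_day)                # 3150000
print(round(cost_per_million, 2))  # 653.65
print(round(ratio * 100, 1))       # 18.9 (percent of the manual cost)
```

The estimate assumes the machine is fully occupied with searching for all 420 working hours of the month; any idle time raises the cost per search proportionally.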

The speed of the machine, 150,000 documents an 
hour, is compared with the speed of searching 6,897
documents per hour by "a completely human agency".

Whereas the electronic computer actually 
searches every document record, the human 
agency searches only in certain areas of the 
card catalog where subject headings, etc., indicate
that the document containing the required
information are likely to be listed. Thus, vast 
areas of the catalog are eliminated without ever 
being touched (301). 

Claims: 

The IBM 705 system, staffed by the necessary 
crew, could do a literature search of a docu- 
ment collection in approximately 2% of man- 
hours required by a completely human agency; 
and the results of the search would be more 
accurate, thorough, and complete than the re- 
sults of a search carried out by a human ag- 
ency (302). 

Use: The system is not yet in operation. No 
mention is made of any test with the system. 

Evaluation: Several points made are open to
question. Can a machine replace the card catalog efficiently
if it has to search through the equivalent of
the entire card catalog for every search and if only 
one search can be done at one time? Is it bad that a 
human being can skip over large parts of the catalog 
which are of no interest to the search at hand? Are 
searches by machine automatically more accurate, thor- 
ough, and complete? 



Univac 

Univac I and Univac II are large scale, general
purpose digital computers manufactured by the Sperry-Rand
Corporation. Univac II, the later model, has the
following features: 

Input

Unityper, a modified electric typewriter which records
   characters directly on magnetic tape in binary code
Punched cards (through card-to-tape converter)
Magnetic tape

Memory

24,000 characters magnetic core memory

Output

Uniprinter, a modified electric typewriter actuated
   by magnetic tape signals
High speed on-line printer
Magnetic tape
Punched tape
Punched cards (303)

A study was made to determine the suitability of 
the Univac Fac-Tronic System to literature searching. 
The model studied consisted of a library of 1,000,000
documents. Each document was to be identified with 
an 8 digit serial number. The master reference file, 
i.e., the index which was to be searched by the machine,
would include the document serial number and
other characteristic features of the document. These 
features might be: the author, date, contract number 
in the case of a government report, and subject or sub- 
jects. Each document would have an average of 15 
characteristics encoded with a maximum of 30 charact- 
eristics (304). 

The information would be encoded as numbers and 
letters. A 4 letter-digit combination code would represent
each item of information in a document. In an
end to end search, the Univac is capable of searching 
a file of 1,000,000 documents averaging 15 approaches 
in approximately 4 hours. As the machine is presently 
constructed only one question can be answered at one 
time. By a simple modification of the circuitry to 
speed up the matching operation 6 to 10 questions can 
be answered in 4 hours. Schemes have been investig- 
ated that would decrease the search time to half an 
hour and increase the number of simultaneous searches 
by several fold. These modifications would, however, 
involve major changes in the design of the computer 
(305). 

A later report on the possible application of the 
Univac for information retrieval was written by O'Connor
(306). In this report 2 systems which would make use 
of the inverse ordering of document numbers on de- 
scriptors are discussed. An example of the speed of the 
system is indicated with a hypothetical index to 1,000,000
documents. A 6-descriptor search making use of
logical sum, product and difference can be conducted in
less than 5 and 12 minutes of computer time, respectively,
to select and print out the numbers of all the documents
in the collection which satisfy that retrieval request
(307). One of the systems requires several inexpensive
modifications of the computer. For Univac II,
a later model of the computer, all storage tape and 
search figures are cut by 50% or more (308). 
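
With document numbers filed under descriptors, the logical sum, product, and difference mentioned above correspond to set union, intersection, and difference. The sketch below illustrates this with a hypothetical four-descriptor index; it is not drawn from O'Connor's report, and the descriptor labels and function names are invented for the example.

```python
from typing import Dict, Set

# Hypothetical inverted index: descriptor -> document numbers posted under it.
index: Dict[str, Set[int]] = {
    "A": {1, 2, 3, 4},
    "B": {3, 4, 5, 6},
    "C": {2, 3, 6, 7},
    "D": {3},
}


def logical_sum(x: Set[int], y: Set[int]) -> Set[int]:
    """Documents posted under either descriptor."""
    return x | y


def logical_product(x: Set[int], y: Set[int]) -> Set[int]:
    """Documents posted under both descriptors."""
    return x & y


def logical_difference(x: Set[int], y: Set[int]) -> Set[int]:
    """Documents posted under the first descriptor but not the second."""
    return x - y


# Request: (A or B) and C, but not D.
hits = logical_difference(
    logical_product(logical_sum(index["A"], index["B"]), index["C"]),
    index["D"],
)
print(sorted(hits))  # [2, 6]
```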

Use: No further work was reported on the proposal
by Mitchell; O'Connor's work seems to be in an
early experimental stage. 






Herner and Company Special Computer 

A machine for literature searching is now being
developed by Herner and Company, Washington, D. C.
Herner states: 

The group has developed and built a small ma- 
chine which records information in digital form 
on magnetic tape, searches the recorded infor- 
mation, performs simple correlations and in- 
dicates those entries that meet specific search 
conditions. Input is by keyboard; output is by 
solenoid-operated electric typewriter. Search 
conditions are specified by setting from one to 
three four digit numbers on rotary switches; 
the required correlation is specified by position- 
ing another rotary switch. The machine then 
follows a fixed internally stored program of 
search, buffer and output operations. The 
basic construction of the machine is now completed,
and it is being tested in the 'breadbox'
stage before being mounted in a console. The 
capacity of the machine is such as to suit it 
for collections of 50,000 to 100,000 documents
with an average of five coded entries each 
(309). 



Lowry and Albrecht Special Computer 

A special purpose computer for information re- 
trieval and its possible application in a large organiza- 
tion is outlined by Lowry and Albrecht in a paper for 
the International Conference on Scientific Information 
(310). They state that: 

Information is available in amounts far greater 
than required by any single research organiza- 
tion and to the degree that it is extraneous to 
needs it constitutes a barrier to research pro- 
gress if attempts are made to handle it in an 
information system (311). 

Intelligent selection, a carefully defined and applied
acquisition program, is essential in an information sys- 
tem. Even with such selection, 

in a large organization the amount of informa- 
tion stored will soon exceed the capacity of 
manual or common accepted retrieval methods 
(312). 

A machine system to cope with such a situation and its 
possible application in large organizations is described 
in general terms: 

In most instances, the area of interest of a 
person seeking retrieval of information is not 
fully matched by the scope and setting of infor- 
mation stored and indexed for future reference. 
It is in this situation that the culling process 
achieved through the associations and correla- 
tions of the mind may be employed to advan- 
tage to determine what stored information may 
be of interest, even though it does not fully 
define the user's area of interest. A simple 
technique which approaches a function of the 
mind and which may be readily embodied in 
machine searching of information fields com- 
prises the establishment of two classes of in- 
dexing terms, namely, general and essential 
(313). 

This is similar to the technique used by the U. S. Patent
Office in their work with the Bendix G-15D computer.
The technique is called weighting and is discussed
above (314). The indexing term is specified as
general or essential at the time a particular search is 
undertaken, not at the time of indexing. 

In the proposed system index entries, which are 
not only single keywords but which can also be indexing 
phrases, names, etc., are entered with the document
information on magnetic tape. The following house- 
keeping information is also included: document start 
and document end. The tape contains the index to the 
document in encoded form and is sent through the computer
during the search. The special purpose computer
consists of the following parts: 

1. A magnetic tape input device to feed the tape 
into the machine; 

2. Storage blocks for identifier words (search 
terms). These can be included in the input device or 
can be a separate part of the machine; 

3. Comparison circuits to match the indexing 
terms on the magnetic tape with the indexing terms in 
the storage block and to sense start and end of docu- 
ment signals; 

4. Associated circuits programmed to detect es- 
sential and general terms and to send signals to a 
counter; 

5. A counter to count the number of comparisons 
made and to send a signal to the output gate if the de- 
sired matches are made; 

6. An output gate to signal that the document 
identity is to be read into the output device. 

The output device can be a high speed direct printer, 
though this is not likely to be economical. Instead the 
document identity can be coded on another magnetic 
tape. This tape can be used to actuate a print out at 
the end of the search. 

A filled out request form for a machine search is 
illustrated (315). The form contains all identifier words 
(descriptors) needed for the search in alphabetic and 
coded form. It also specifies whether each descriptor 
is general or essential and indicates the number of 
general descriptors that have to be matched. 
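
The matching rule implied by the request form can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the record layout (a document identity paired with its index terms), and the sample terms are hypothetical and do not come from the paper. A document is selected when it carries every essential descriptor and at least the stated number of general descriptors.

```python
def matches(doc_terms, essential, general, min_general):
    """True if the document carries every essential term and at least
    `min_general` of the general terms (hypothetical helper)."""
    doc = set(doc_terms)
    if not set(essential) <= doc:
        return False
    return len(doc & set(general)) >= min_general


def search(tape, essential, general, min_general):
    """Scan (doc_id, terms) records in order, as one pass of the tape would."""
    return [doc_id for doc_id, terms in tape
            if matches(terms, essential, general, min_general)]


# Illustrative records only.
tape = [
    ("D1", ["magnetic tape", "indexing", "retrieval"]),
    ("D2", ["magnetic tape", "printing"]),
    ("D3", ["indexing", "retrieval", "coding"]),
]
hits = search(tape, essential=["magnetic tape"],
              general=["indexing", "retrieval", "coding"], min_general=2)
print(hits)
```

Because the essential and general labels are arguments of the search rather than part of the stored record, the same tape can serve requests that weight the terms differently.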

Programming the machine is described only in 
general terms. It could be by setting keys and switch- 
es to establish the code for the search terms or it 
could be by punched card or magnetic tape programs. 
One block diagram shows a system with which only one
search can be done at a time (316). Another one depicts
an expanded version which will permit 3 simultaneous
searches.

In addition to using the system for retrospective 
searching, its use is suggested in a current informa- 
tion announcement system tailor-made for individual 
users' interests.

Use: The system is still in the machine develop- 
ment stage. A simulation of the plans proposed in the 
paper is contemplated on a simulation computer (317). 



3. Magnetic Disc 
IBM 305 RAMAC 

The IBM 305 RAMAC has a large (5,000,000 alpha-numeric
characters) random access memory which
makes the machine potentially useful for inverse order 
(document number or descriptor) indexing systems. 
Two literature searching experiments with an IBM 305 
RAMAC are reported in the literature. 

One application of the machine to literature 
searching is at IBM's San Jose Research Laboratory. 
It was organized to gain practical experience in this 
area and to provide a functional service for the laboratory
(318). Another IBM publication describes the operation
of the machine (319).

The IBM RAMAC ... has a capacity of 5 million
alpha-numeric characters. It consists of 50
aluminum discs 24 inches in diameter, mounted
on a common shaft which revolves at the rate
of 1200 r.p.m. Both sides of the discs are
coated with an iron oxide layer upon which
data may be recorded in the form of magnetized
spots. The movable reading-recording
arm contains two heads--one facing the top
side of the disc and the other facing the
bottom of the same disc. The arm may be
moved vertically to any of the 50 discs and
laterally to any one of the 100 concentric
recording tracks (320).

In the experiment, the 5,000,000 character memory
was divided into 50,000 record groups of 100 characters
each. The memory was also divided into 3 units, with
the following capacity in terms of record groups:

Dictionary: 1,000 record groups. Each 100 character
dictionary record group contains up to 6 words
and their home addresses. The 1,000 records of the
dictionary will hold 6,000 words, each of which is an
indexing term (descriptor) for the system;

Home records and overflows: 24,000 record
groups. The home records are locations in the memory
where document numbers corresponding to the indexing
terms are stored. There are 6,000 home records,
one for each indexing term, and 18,000 overflows.
Since each home record can only store 12 document
serial numbers (according to Firth (321); 17 according
to Nolan (322)) plus housekeeping information, additional
storage has to be provided for the location of subsequent
document numbers. These are stored in the
overflow, along with the address of the home record 
for the same descriptor. Since space is not preassigned 
to each descriptor (beyond the first 12 serial numbers) 
considerable space is saved in the storage since there 
is no way of predicting how descriptors will be used 
and what the space requirements for each descriptor 
will be. 

Bibliography: 25,000 records. One 100 character
unit is used per indexed document and the following
information is included: document number, library call
number, author, date, and up to 56 characters of the
title.
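
The allocation above can be checked arithmetically, and the home record and overflow arrangement can be sketched as a chain of fixed-size records. The Python below is a minimal sketch, not the RAMAC program: the class and method names are hypothetical, records are modelled as lists rather than 100 character units, and the 12 serial numbers per record follows Firth's figure.

```python
RECORD_SIZE = 100  # characters per record group

# Record groups assigned to each of the three units of the memory.
LAYOUT = {
    "dictionary": 1_000,
    "home_and_overflow": 24_000,
    "bibliography": 25_000,
}
total_records = sum(LAYOUT.values())
total_chars = total_records * RECORD_SIZE
dictionary_words = LAYOUT["dictionary"] * 6  # up to 6 words per record


class PostingFile:
    """Home record plus overflow records for each descriptor (sketch)."""

    def __init__(self, slots=12):
        self.slots = slots      # serial numbers held by one record
        self.records = {}       # descriptor -> [home, overflow, overflow, ...]

    def post(self, descriptor, serial):
        chain = self.records.setdefault(descriptor, [[]])
        if len(chain[-1]) == self.slots:
            chain.append([])    # take a new overflow record only when needed
        chain[-1].append(serial)

    def documents(self, descriptor):
        return [s for record in self.records.get(descriptor, [])
                for s in record]

    def records_used(self, descriptor):
        return len(self.records.get(descriptor, []))


postings = PostingFile()
for serial in range(1, 31):
    postings.post("polymer", serial)

print(total_records, total_chars, dictionary_words)
print(postings.records_used("polymer"))
```

Overflow records are claimed only as a descriptor actually fills its home record, which is the space saving the text describes.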

Firth (323) outlines the procedure used in the sys- 
tem. A sequential 5 digit serial number is assigned 
to each document and descriptors are selected for the 
document. Two punched cards are then prepared per 
document: one listing the descriptors and the document 
serial number, the other bibliographic information and 
the document serial number. Descriptors are limited 
to 10 characters because of the need for a fixed field 
length for machine reading. The 2 punched cards per 
document are then read into the RAMAC for storage in 
the memory. 

In addition to the list of descriptors, phrase 
groupings can be created by linking related descriptors
(324). This is done by punching the code for a period
between descriptors which should be linked. It is an 
instruction to the machine to assign a common 2 digit 
number in addition to the 5 digit serial number to the 
descriptors. Word sequence within the phrase can also 
be indicated by assigning a sequence digit in addition to 
the grouping digits and document serial number (325). 
This is intended to eliminate the "Venetian blind" versus
"blind Venetian" type of false drop but can also be used 
to indicate the sequence of steps in a process as sug- 
gested by the Patent Office group (326). 
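
The effect of the grouping and sequence digits can be illustrated with a short Python sketch. The data layout is an assumption made for illustration: each posting is modelled as a (serial number, group number, sequence number) triple, and the function names are hypothetical. A phrase is accepted only when both words occur in the same document and group, in adjacent sequence positions.

```python
from collections import defaultdict


def index_document(index, serial, phrases):
    """Post each word with its document serial, group, and sequence number.

    `phrases` is a list of word lists; each inner list stands for one set
    of descriptors linked together at indexing time.
    """
    for group, words in enumerate(phrases, start=1):
        for seq, word in enumerate(words, start=1):
            index[word].append((serial, group, seq))


def phrase_search(index, first, second):
    """Serial numbers in which `second` directly follows `first`
    within the same linked group."""
    firsts = set(index.get(first, []))
    hits = set()
    for serial, group, seq in index.get(second, []):
        if (serial, group, seq - 1) in firsts:
            hits.add(serial)
    return sorted(hits)


index = defaultdict(list)
index_document(index, 101, [["venetian", "blind"]])
index_document(index, 102, [["blind", "venetian"]])
index_document(index, 103, [["venetian"], ["blind"]])  # unlinked words

print(phrase_search(index, "venetian", "blind"))
print(phrase_search(index, "blind", "venetian"))
```

Document 103 carries both words but in separate groups, so it answers neither phrase; without the grouping digits it would be a false drop for both.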

In searching, the home addresses of the selected 
descriptors are typed on a punched card; they are then 
read into the RAMAC. Both logical sum and logical 
product searches can be made. Descriptors' home addresses
are listed for logical product searches and are
listed and separated by a comma for logical sum search- 
es. According to Nolan: 

The search was put into operation as a result 
of information supplied to the machine by 
means of its own input--the punched card. No
human intervention was necessary to set switch- 
es, or wire a control panel or modify a pro- 
gram for the particular search being processed 
(327). 
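
In an inverted file, a logical product is the intersection of the document number lists for the chosen descriptors, and a logical sum is their union. The Python below is a minimal sketch of that idea using sets; the descriptor names and document numbers are invented for illustration, and nothing here reproduces the RAMAC search program itself.

```python
# Hypothetical postings: descriptor -> set of document serial numbers.
postings = {
    "polymer": {1, 2, 5},
    "coating": {2, 3, 5},
    "adhesive": {4, 5},
}


def logical_product(postings, descriptors):
    """Documents posted under every one of the descriptors."""
    sets = [postings.get(d, set()) for d in descriptors]
    return set.intersection(*sets) if sets else set()


def logical_sum(postings, descriptors):
    """Documents posted under at least one of the descriptors."""
    return set().union(*(postings.get(d, set()) for d in descriptors))


print(sorted(logical_product(postings, ["polymer", "coating"])))
print(sorted(logical_sum(postings, ["coating", "adhesive"])))
```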

Up to 10 terms can be matched simultaneously accord- 
ing to Nolan (328); up to 9 terms, according to Firth 
(329), who writes: 

Two basic programs and control panels are re- 
quired for this application. The loading pro- 
gram which contains both dictionary and docu- 
ment loading requires 96 instructions. The 
search program is made up of 89 instructions 
and is applicable to any search without altera- 
tion. Programs are permanently stored in the 
file. This permits a complete change from 
one operation to another by merely changing 
the control panel and reading in a single card 
to withdraw the desired program and put it into
operation (330).

Capacity of machine: A 5,000,000 character random
access memory is very large for a computer. Nolan
points this out when he states that the maximum capac- 
ity of electronic computers is usually in the range of 
50,000 to 100,000 characters (331). In the application 
illustrated by IBM, 12 serial numbers are posted on 
each record with an average of over 11 descriptors per 
document. An index to, and bibliographic information 
for, about 24,000 documents can be incorporated into
the system (332). 

Use: No information was given about the use of
this particular file.

Another experimental application of the same ma- 
chine is described in a Patent Office Research and De- 
velopment report, Number 14 (333). The group of pat- 
ents selected for this experiment are in the chemical 
polymer art. Subject matter includes properties, func- 
tions of inorganic and organic compounds, and process- 
es involving these compounds. Three levels of descrip- 
tors are used: specific compounds (the most specific 
level) represented by a 3-digit code; specific structural 
fragments of these compounds, represented by a 2-digit 
code; and mutual attributes of the fragments, represented
by a 1-digit code. The authors believe that more
levels of search can be provided in order to encompass 
a more extensive or elaborate hierarchy (334). 

All but the most generic level of terms are gen- 
erated as needed from the actual terms used in the 
patents so that no pre-established dictionary is re- 
quired for these terms. Interrelations among descrip- 
tors are indicated by adding one or more arbitrarily 
assigned digits at the end of the document numbers for 
pertinent descriptors. A computer program has been 
developed to recognize the symbols for the three levels 
of terms and their relationship. The computing opera- 
tion is one of matching for logical product relationships 
and merging for logical sum relationships. Specifically, 
punched cards with the addresses of the portion of the 
files to be searched (descriptor addresses) and the log- 
ical grouping of the descriptors are inserted into the 
RAMAC. The computer will then seek out the sets of
data in its file corresponding to these addresses and
perform a succession of merging, matching, and re- 
seeking operations until it arrives at the numbers of 
the documents satisfying the search requirements (335). 

The document numbers can then be printed out by the 
machine. 

One of the yet unsolved problems of the system 
is mentioned: 

Where it is required to find a process contain- 
ing A plus B and another process containing C 
plus D, it is not yet possible to avoid the in- 
valid answer A plus B plus C plus D, all in 
the same process. Similarly, a fragment an- 
swering two separate sets of descriptors will 
respond as an answer to both (336). 

Use: The system seems to represent one of a 
continuing set of experiments on the use of machines 
for searching operations at the U. S. Patent Office. 

No information is given on the use of the system. 

Evaluation: The promising feature of the RAMAC 
is its large random access memory. This feature 
makes possible the use of the machine in an inverted 
order of storage type systems where relationships a- 
mong descriptors may be indicated. The inverted order 
of storage eliminates the need for an end to end search. 
This was the principal advantage of the "Uniterm" and
"Peek-a-Boo" systems, but something had to be traded
to gain this advantage: the possibility of indicating
relationships among descriptors in any but very elementary
fashion. The RAMAC system permits both advantages,
but at a cost. The rental price of $3,000
plus per month is high, not when compared to the large
general purpose computers but certainly when compared
to the cost of catalogers, and the capacity of the machine
is relatively low in terms of documents included
in the system. In the IBM application the index to,
and bibliographic data for, 24,000 documents could be
stored on one 5,000,000 character memory unit. If
the bibliographic information were deleted, that is, if
the answer to a search were only to be a document
number, the capacity of the machine would be doubled
but this would still mean only 48,000 documents. Additional
memory units can be incorporated into the system
at a cost of $700 plus per month. It would
seem that the RAMAC as it now stands is too costly 
for use in small collections and does not have the ca- 
pacity for large collections. 



III. Data Files
A. Dow Chemical Company 

A file of coded chemical structures is stored on 
magnetic tape and searched with the IBM 704 at the 
Dow Chemical Company. Opler and Norton started development
work on this project in 1952 (337). Punched
card scanning equipment was considered impractical
because a search must be able to start anywhere in a
structure and continue in any direction, requirements
which complicate the "untangling" of
multiple substituents. Such substituents might be, for
example, 4 chlorine atoms, 2 ortho to each other on a
phenol and at least one of the others on a phenyl group
attached to the phenol (338). This example illustrates
a type of search where descriptors--in this particular
case structural units--cannot merely be listed; the
relationships among the parts must also be brought
out.

In 1950, the chemical code consisted of 332 struc- 
ture units (larger than a single atom but not as large 
as a "complex" group) (339). The structure units and
the relationships among these units, which are
reduced to sequences of 3-digit and one-digit numbers
respectively, make up the code for chemical structures.
The structure of a chemical is translated into groups 
of 7 digits which have the following meaning: 

The first digit gives the location in some other 
group (B) to which group (A) is attached to 
group (B). The second digit tells by which of 
its positions the group (A) is attached to group 
(B). The third, fourth, and fifth digits desig- 
nate the group number (001-099) [the structure 
unit] assigned to group (A). The sixth digit is 
the identifying number of the group (B) to which
group (A) is attached. The seventh digit is the
identifying number of the now coded group (A) 
(340). 
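
Read literally, the quoted description fixes the meaning of each position in a 7-digit group, and a decoder can be sketched as follows. The field names are hypothetical labels for the positions described in the quotation, and the sample group is invented; this is an illustration of the layout, not the Dow program.

```python
from typing import NamedTuple


class Attachment(NamedTuple):
    position_on_b: int    # digit 1: location on group (B) where (A) attaches
    position_on_a: int    # digit 2: position of (A) used for the attachment
    structure_unit: int   # digits 3-5: group number assigned to (A)
    group_b_id: int       # digit 6: identifying number of group (B)
    group_a_id: int       # digit 7: identifying number of group (A)


def decode(group: str) -> Attachment:
    """Split one 7-digit group into its five fields."""
    if len(group) != 7 or not (group.isascii() and group.isdigit()):
        raise ValueError("a coded group must be exactly 7 decimal digits")
    return Attachment(
        position_on_b=int(group[0]),
        position_on_a=int(group[1]),
        structure_unit=int(group[2:5]),
        group_b_id=int(group[5]),
        group_a_id=int(group[6]),
    )


# Invented example: unit 017, numbered 2, attached by its position 1
# to position 4 of the group numbered 1.
print(decode("4101712"))
```

A whole structure is then a list of such groups, and the connections among units can be followed through the two identifying numbers.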

Each chemical compound is identified by a serial 
number. The serial number and the molecular formula 
of each chemical compound are also encoded. The code 
is punched onto punched cards and converted to magnet- 
ic tape by means of a specially prepared program. 
Coded data is grouped in the following way on the tape: 

Compound block: the code for the serial number, 
molecular formula, and structure of a chem- 
ical; 

Block: 100 to 300 compound blocks or chemical 
compounds; 

Reel: approximately 100 blocks. 

Consequently, each reel of magnetic tape will contain 
10,000 to 30,000 encoded chemicals (341). 

A general search program has been prepared 
which, when modified by search request cards, can be 
used to handle 90% of the searches (342). The program 
consists of some 2,000 computer instructions, contained
in a deck of approximately 100 binary punched cards 
(343). When searching for a particular structure, 
agreement is checked for on five criteria. These are: 

1. Empirical formula; 

2. Presence of structure units; 

3. Indirect connection among structure units
through a third structure group;

4. If #3 applies, tests are made for positional
differences between points of attached pairs
of structure units;

5. Direct connections between structural units. 

If a compound fails to meet any of the criteria, it is 
rejected then and there, and another compound is 
searched (344). 
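
The value of ordering the criteria is that a compound is dropped at the first test it fails, so the later tests are never run for most of the file. The sketch below shows only that control flow, with two of the five criteria stood in for by simple comparisons. The record layout, unit names, and sample records are hypothetical toy data, not real chemistry and not the Dow code.

```python
def first_failure(compound, tests):
    """Apply the tests in order; return the name of the first test the
    compound fails, or None if it passes them all."""
    for name, test in tests:
        if not test(compound):
            return name   # rejected then and there; later tests are skipped
    return None


def search(file, tests):
    """Serial numbers of the compounds that pass every test."""
    return [c["serial"] for c in file if first_failure(c, tests) is None]


request = {"formula": {"C": 6, "H": 5, "Cl": 1},
           "units": {"phenyl", "chloro"}}

tests = [
    ("empirical formula", lambda c: c["formula"] == request["formula"]),
    ("structure units", lambda c: request["units"] <= c["units"]),
]

file = [
    {"serial": 1, "formula": {"C": 6, "H": 5, "Cl": 1},
     "units": {"phenyl", "chloro"}},
    {"serial": 2, "formula": {"C": 6, "H": 6, "O": 1},
     "units": {"phenyl", "hydroxyl"}},
    {"serial": 3, "formula": {"C": 6, "H": 5, "Cl": 1},
     "units": {"phenyl"}},
]

print(search(file, tests))
```

Putting the quick tests first (here the formula comparison) keeps the costlier connection tests for the few compounds that survive.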

The results of the search are serial numbers of 
selected chemicals. An alternative to this is the print
out of names of chemicals corresponding to the serial
numbers. These are found by searching a special tape.
A new technique has been developed in which the search
results, that is, the chemical structures, are displayed 
pictorially on an oscilloscope. The structures are pho- 
tographed so that they can be inspected at leisure (345). 
Chemical structures are made up of repeating patterns 
of letters, numbers, and simple geometrical designs. 
These are depicted as basic units including simple com- 
binations of lines on a 64-dot square. The selected 
dots are thrown on the screen in succession but so 
quickly that the design appears instantaneously. Punched 
card codes consisting of 2-letter or number combina- 
tions were worked out to yield on demand any desired 
symbol in the 64-dot square (346). 

A method for conducting up to 5 searches simul- 
taneously, called multiplexing, has also been developed 
(347). In 1956, coding experience was obtained on 
15,000 compounds (348). No later figures are given. 
Search speeds in the order of 10,000 structure comparisons
a minute are obtained (349). A comparison was
made between the accuracy of human and machine
searching, disregarding speed. Searches for chemicals
with certain specified structural features were made 
with the following results: 

Human Machine
Correct retrieval
  Completely characterized 82 92
  Partially indeterminate but known to be pertinent 15 15
  Partially indeterminate with relative positions not known 26
Incorrect retrieval (fail-safe) 2 30
Failed to retrieve 13 5

From this experiment Opler concludes: 

Any mistakes the computer makes arise from 
human error in original coding or cause re- 
trieval of compounds that do not fit the search 
criteria. No case of random failure has been 
detected to date. We conclude that search
speed is reasonably high and accuracy is entirely
satisfactory (350).

Use: No data is given on the use of the system. 



B. Enjay Laboratories 

At the Enjay Laboratories, butyl rubber compound- 
ing data is recorded on punched cards for searching on 
an IBM 101. Each tested compound is given an ident- 
ifying serial number and data about the compound is re- 
corded on 4 separate forms (351). The forms have 
been set up to correspond exactly to the layout of the 
IBM card so that the information can be directly 
punched onto the card. The individual forms contain 
the following information: 

Cure and stress and strain data 

Physical tests other than stress and strain data 

Aging test 

Electrical tests 

The completed form sheets, called data sheets, are 
sent to the data processing clerk for card punching. 
One card is made out for each data sheet so that a 
maximum of 4 cards is made out per compound. The 
data sheets are microcarded and serve as the permanent
records of the tests. Eight thousand cards, representing
about 3,500 compounds, were included in the
file by 1957 (352).

The most important application of the file is that 
of searching for compounds possessing a combination of 
specified properties. The IBM 101 plug board is wired 
for the code of these properties. The individual decks 
(up to 4 if each type of property is being compared) 
are sorted through the machine at the rate of 450 cards 
per minute. The machine is programmed to print out 
serial numbers of pertinent compounds. Numbers on 
separate lists are matched manually. 

Another application of the file is the preparation 
of lists of compounds arranged by ascending numerical
values of a particular property.

C. Federal Telecommunications Laboratories 

The Electronic Spectroanalyzer developed by the 
Federal Telecommunications Laboratories combines a 
spectrophotometer and a digital computer (353). The 
instrument consists of 4 units: 

A spectrophotometer to emit infrared rays and 
measure absorption 

A recording device to encode the spectral data in 
numerical form on paper or magnetic tape 

A reference "library" which contains the infrared
absorptions of possible constituents in nu- 
merical form on digital tape 

A high speed digital computer to do the mathemat- 
ical calculations and record the analysis 
directly 

The entire spectrum of a chemical mixture of up 
to 10 chemicals is analyzed in the spectrophotometer. 
The instrument records the mixture spectrum and con- 
verts the entire spectrum into numerical form on paper 
or magnetic tape. Each "library" spectrum is next 
compared to the unknown mixture by multiplying the 
"library" tape by a specimen tape at each wave length. 
The products are added together and used as coefficients 
in linear simultaneous equations. The answers confirm 
the presence or absence of constituents as well as the 
quantity of these constituents present (354). The form 
of the answer is not indicated. 
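
The source does not give the equations, but the description (products of library and specimen values summed over wave lengths, then used as coefficients of simultaneous linear equations) is consistent with an ordinary least-squares fit. The Python below is a sketch under that assumption: it supposes the mixture spectrum is a weighted sum of the library spectra, forms the summed products, and solves the resulting equations by Gaussian elimination. The spectra, constituent names, and function names are invented for illustration.

```python
def dot(u, v):
    """Sum over wave lengths of the products of two spectra."""
    return sum(a * b for a, b in zip(u, v))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("library spectra are not independent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def analyze(library, specimen):
    """Estimate the quantity of each library constituent in the specimen."""
    names = list(library)
    a = [[dot(library[i], library[j]) for j in names] for i in names]
    b = [dot(library[i], specimen) for i in names]
    return dict(zip(names, solve(a, b)))


# Invented absorption values at four wave lengths.
library = {
    "X": [1.0, 0.0, 2.0, 0.5],
    "Y": [0.0, 1.0, 1.0, 0.0],
    "Z": [0.5, 0.5, 0.0, 1.0],
}
specimen = [2.5, 0.5, 4.0, 2.0]   # constructed as 2 parts X plus 1 part Z

result = analyze(library, specimen)
print({name: round(q, 6) for name, q in result.items()})
```

A quantity near zero indicates that a constituent is absent, which matches the statement that the answers confirm both presence and amount.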

The first commercial model of the Electronic 
Spectroanalyzer was scheduled to go to the Sloan-Ketter- 
ing Institute in the fall of 1958 (355). 

D. Midwest Research Institute 

Data on chemical compounds is entered on magnetic
tape for IBM 704 searching at the Midwest Research
Institute in Kansas City, Missouri. The objective
of this program is the determination of new uses
for chemical compounds by a correlation of chemical 
and physical characteristics with known uses (356). One 
tape is to be used for each of the following character- 
istics of chemicals: 

Physical properties 

Usage 

Name 

Chemical structure 

Each chemical is identified by an arbitrarily assigned 
serial number on each of the tapes. Since there is on- 
ly a total of approximately 1,875 chemical compounds 
in the system thus far, all 4 files are entered on one 
tape. During a search this data is transferred onto 4 
working tapes (357). 

The physical property data includes the empirical 
formula of the chemical compound and the physical con- 
stant for the melting point, boiling point, refractive in- 
dex, density, and molecular weight. The physical con- 
stants are stored as a 2-decimal digit code. The em- 
pirical formula consists of the number of atoms in each 
of the 10 more commonly found elements in a special 
decimal-digit format. 

Known uses for chemical compounds are coded as 
6-decimal digit codes. Structure data is stored accord- 
ing to a modification of the code devised by Norton and 
Opler of the Dow Chemical Company. This code has 
been treated in an earlier part of the present section of 
this report. 

Searches are programmed by punched cards. Ten 
sets of search criteria (10 descriptors from the physi- 
cal property, usage, and structure unit categories) can 
be searched for simultaneously. Each set of test cri- 
teria is entered on one punched card as a series of 6- 
character words giving the category of the descriptor 
(physical property, usage, etc.) and the code for the
descriptor as well as print out instructions (358). In
a search for 10 descriptors, the machine can also be
programmed to list separately chemical compounds
which satisfy fewer than 10 descriptors (359). An on-line
printer--a printer that is directly connected with
the computer--can be used for the output data; but Institute
personnel feel that off-line printer operations
are more efficient and hence make use of this mode of 
operation (360). 
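
The separate listing of partial matches amounts to counting, for each compound, how many of the search descriptors it satisfies and grouping the serial numbers by that count. A minimal Python sketch follows. The record layout, a set of (category, code) pairs per compound, and all codes shown are hypothetical; only the 6-character code length is taken from the description above.

```python
from collections import defaultdict


def search(compounds, criteria):
    """Group compound serial numbers by the number of criteria satisfied.

    Compounds satisfying none of the criteria are left out.
    """
    by_count = defaultdict(list)
    for serial, descriptors in compounds.items():
        n = len(criteria & descriptors)
        if n:
            by_count[n].append(serial)
    return dict(by_count)


criteria = {("usage", "000123"), ("structure", "000045"),
            ("physical", "000017")}

compounds = {
    1: {("usage", "000123"), ("structure", "000045"),
        ("physical", "000017")},
    2: {("usage", "000123"), ("structure", "000099")},
    3: {("usage", "000555")},
}

result = search(compounds, criteria)
for count in sorted(result, reverse=True):
    print(count, "of", len(criteria), "descriptors:", result[count])
```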

Use: The system was demonstrated at the International
Conference on Scientific Information in Washington,
D.C., in November, 1958, but no data has
been given on its use. 



E. Monsanto Chemical Company 

Results of screening tests for chemicals, i.e.
preliminary tests of the usefulness of a chemical for
a particular application, are entered in a form which
lends itself to searching and reproduction by an IBM
702 computer at the Monsanto Chemical Company.

A central data group receives all samples to be 
screened. This group assigns a 5 digit serial number 
to each sample and prepares the following records on 
punched cards: 

Compound structure and identifying number 
Compound molecular formula and identifying num- 
ber (in later work the molecular formula 
record has been combined with the struc- 
ture record) (361) 
Compound name and identifying number (362) 

The sample is then sent to the laboratory for testing. 
The test results are recorded by the scientists on note- 
book pages containing 80 columns. The results of a 
single test, including conclusions, usually occupy a 
single line. That is, they can be entered on an 80
column punched card. The first 12 columns are used 
for identification and contain the compound number, 
place of synthesis, date of testing, and field of application
under test. Column 13 is used for one of the following
general conclusions:

No further interest

Some activity, a lead for further study 
Active, needs secondary screening or a field test 
(363) 

The actual test results are also coded on some
of the remaining columns. A carbon copy of the note- 
book page is sent to the tabulating department and the 
test result as well as the compound structure, compound
name, and compound molecular formula, are entered
on separate reels of magnetic tape.

The chemical structure code is a very simple 
one. It consists of only 20 structural units standing 
for chemical bonds and chemical elements--some single,
some groups of elements. The encoding of a structure 
consists of rewriting the structure on cross-hatched pa- 
per. The elements and bonds are placed in the squares 
of the cross-hatched paper to compare exactly with the 
original chemist-written structure (364). The rules 
prepared so far are sufficient to handle the great ma- 
jority of compounds, including many steroids. There 
is a relatively large number of compounds which re- 
quire special handling. 

The structure of each compound, the identifying 
number, and the molecular formula are punched onto 
punched cards. The first 5 columns of the card are 
punched with the compound identification number. Col- 
umn 6 is a sequence control position for card number- 
ing; the balance of the card is for structure data stor- 
age. A maximum of 10 cards may be used for each 
record, allowing storage of 740 characters of structure 
data. The instructions for the computer are stored 
permanently in a small deck of punched cards. All the 
data cards which are added to the structure file are en- 
tered in the computer along with this instruction deck. 
The group of cards which stand for the structure of a 
compound is transcribed through the computer onto a 
reel of magnetic tape in the form of a variable length 
record. Since the record size may vary from 45 to 
over 700 characters, the variable length feature allows 
maximum reading and writing speed in computer operations
(365). The computer calculates the molecular
formula from the structure and inserts it in the proper
place on the record (366). The computer also checks 
the accuracy of the input data by seeing, for example, 
that any atom does not have too few or too many bonds 
(367). One reel of tape may hold over 25,000 typical
structure-molecular formula records (368). 

In searching for a particular chemical structure, 
the exact structure specifications are transferred on 
IBM cards. The searching procedure is outlined by 
the authors: 

The program deck, containing all the instruc- 
tions necessary to perform any type of search, 
is entered into the computer with control cards 
necessary to perform the specific search. We 
have available for high-speed scanning the mo- 
lecular formula file, which is part of the struc- 
ture record. As each structure is considered, 
the molecular formula is compared with the 
'molecular formula' of the substructure or
moiety sought. If the substructure contains 
strange elements, the structure being compared 
is rejected by the machine. Furthermore, if 
the search requirements demand a greater num- 
ber of any given element than is presented in 
the particular stored structure, it will be re- 
jected by the computer. If the desired molec- 
ular values are available in a structure, then 
the detailed search begins. With the aid of 
the control cards entered with the search program,
the computer gets its ideas as to whether
certain mutations of the substructure desired
should be considered. Depending on
the nature of the search question, and control
of information given the computer, the search
continues until a match is found, or until the 
whole structure has been scanned, no equality 
found, and the structure rejected (369).
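
The molecular formula screen described in the quotation can be sketched as a single comparison of element counts. The Python below assumes, for illustration only, that a formula is held as a mapping from element symbol to number of atoms; the function name and the sample formulas are hypothetical. A stored structure survives the screen only if it contains every element the substructure needs, in at least the required quantity.

```python
def passes_formula_screen(structure_formula, substructure_formula):
    """True when the stored structure has at least as many atoms of each
    element as the substructure sought requires."""
    return all(structure_formula.get(element, 0) >= needed
               for element, needed in substructure_formula.items())


phenol = {"C": 6, "H": 6, "O": 1}

print(passes_formula_screen(phenol, {"O": 1, "H": 1}))   # elements present
print(passes_formula_screen(phenol, {"Cl": 1}))          # strange element
print(passes_formula_screen(phenol, {"O": 2}))           # too few oxygens
```

Only the structures that pass this inexpensive comparison go on to the detailed atom-by-atom search.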

Something under 20,000 compounds are included in the
system (370). 

Use: In addition to making searches for chemical
structures, the system is used to prepare monthly
summaries of screening reports (371). It is also used 
in searches for compounds tested under specified con- 
ditions (372). 



F. Thermophysical Properties Research Center

The Thermophysical Properties Research Center
(TPRC) is an industrially sponsored project established 
in 1957 to organize thermophysical data found in the 
literature, to conduct searches on this data, and to 
publish an annual volume of all the data in the files. 
About 20,000 references are to be coded annually (373).

The thermophysical properties and bibliographic 
citations to these properties are encoded for searching 
and processing on a Datatron Electronic Computer and 
auxiliary equipment (374). The Datatron is a medium- 
sized digital computer with a permanent magnetic drum 
memory of 4,000 machine words. A machine word in
this particular case consists of 10 arithmetic digits and 
a plus or minus sign. In this installation the computer 
is equipped with 2 magnetic tape units, each with one 
tape. Each tape has a capacity of 400,000 machine
words. Auxiliary equipment consists of an Electrodata 
500 card-to-tape converter, IBM card punches, veri- 
fiers, sorters, reproducers, and tabulators (375). 

Recorded information on thermophysical properties 
is selected from the following sources: 

1. Abstract journals 

2. Government and industrial research reports 

3. Reports of private research institutions and
universities, including theses

4. Major research centers throughout the world
with which information exchange agreements
have been established

5. Special collections, reference books, and compendia
(376).

Thirteen items of information are used to characterize
each reference. One item is a 7-digit identifying
serial number. Six items are for the bibliographic citation,
one item is to indicate the language of the original
document, and 5 items are used for the following
subject information: 

1. Properties: a 2-digit code field is used numerically
to encode up to 99 properties;

2. Substance class: a 3-digit code field is used
numerically to encode up to 999 substance classes;

3. Substance name: a 4-digit code field is used
to encode up to 9,999 substances within each class.
The combination of substance class and substance name
allows the encoding of a possible 100,000,000 names;

4. Physical state: one digit;

5. Type of subject coverage: one digit is used
to indicate the nature of the treatment given in a reference
(experimental, theoretical, property values, in
various combinations) (377).

The encoded information for a reference is punched on- 
to the first 40 columns of an IBM card. 

The first 9 columns of the IBM card contain coded 
information on property, class, and substance. By sorting
sets of punched cards which contain newly coded
data by the first 9 columns, an ordered arrangement 
by property, by class within property, and by substance 
within class is obtained. If this set of data constitutes 
the initial storage for any one property of a substance, 
the information is stored in this order on magnetic tape. 
If the data represents new sets of information^ i. e. in- 
formation on the specific properties of a substance al- 
ready in storage, the new data is stored in its proper, 
ordered location. A system of inter filing- -sometimes 
called "banker sorting" is used which maintains both a 
completely filled (packed) and ordered tape^ thus making 
maximum use of available space on the tape. 

An empty tape is placed on one of the magnetic
tape units and the active tape (the tape containing
the previously stored information) on
the other tape unit. The IBM cards containing 
the new information to be stored are read into 
the memory of the computer. The information 
from the active property tape is then copied in- 
to the previously empty tape until a location is 
reached where an insertion of new information 
is appropriate. This new information that has 
just been read into the memory of the comput- 
er is copied onto the previously empty tape. 
When all the appropriate insertions are made 
at this point, copying begins again from the 
original active tape. This process of alternate 
copying from tape unit to memory is continued 
until all new information has been interfiled in 
its proper location. The result is a new or- 
dered and packed tape (378). 
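
The interfiling pass quoted above is, in present-day terms, a sequential merge of two ordered files. The following is a minimal sketch in Python, not the program actually used: it assumes each record is reduced to its 9-column sort key (2-digit property, 3-digit class, 4-digit substance) plus a serial number, and the names sort_key and interfile, like the sample values, are illustrative only.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, int]  # (9-digit sort key, identifying serial number)


def sort_key(prop: int, sub_class: int, substance: int) -> str:
    """Build the 9-column key: 2-digit property, 3-digit class, 4-digit substance."""
    return f"{prop:02d}{sub_class:03d}{substance:04d}"


def interfile(active_tape: Iterable[Record], new_cards: Iterable[Record]) -> Iterator[Record]:
    """Copy the active tape to a new tape, inserting new records in key order."""
    pending: List[Record] = sorted(new_cards)  # the cards are sorted before the pass
    i = 0
    for old in active_tape:
        # Insert every new record that belongs before the next old record.
        while i < len(pending) and pending[i][0] < old[0]:
            yield pending[i]
            i += 1
        yield old
    # New records that sort after everything on the active tape go last.
    yield from pending[i:]


active = [(sort_key(1, 10, 5), 1000001),
          (sort_key(1, 10, 9), 1000002),
          (sort_key(2, 3, 1), 1000003)]
new = [(sort_key(3, 1, 1), 1000005), (sort_key(1, 10, 7), 1000004)]
new_tape = list(interfile(active, new))
```

Because both inputs are read once from front to back, the new tape comes out ordered and packed without any gaps being reserved for later insertions, which is the property the passage emphasizes.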

The file is searched by specifying the search re- 
quirements in punched card code form. The search re- 
quirements are entered into the computer and stored in 
its magnetic drum memory. The appropriate program 
tape is inserted into the computer and the retrieval pro- 
gram is activated. The area of the tape where the 
pertinent references are located (if the search can be 
so circumscribed) is located by means of an address 
directory at the head of the magnetic tape. This area 
of the tape or the entire tape is then matched against 
the search requirements which are stored in the com- 
puter's memory. When pertinent items are located, 
the serial number of the entry together with the other 
descriptive code numbers of the bibliographic citation 
are punched out on IBM cards by the output device of 
the computer. These cards are then fed into an IBM 
tabulator which prints the information coded on the 
punched cards (379). 

In addition to searching for the thermophysical
properties of a single chemical compound or a class of
chemical compounds (the number of searches is not
specified), the information contained in the system is
compiled annually in printed form and sent to the
system's subscribers (380). This information is issued in
3 parts. Part A is a classified list of substances with
code numbers of the properties listed for each
substance and the code number of the substance. Part B
is a list of properties, subarranged by classified sub- 
stances in coded form. Also included in part B are 
the physical state, the type of subject coverage, the 
language, the year of publication, and the identifying 
serial number of the publication. Parts A and B are 
to be cumulated annually. Part C is a listing of refer- 
ences by identifying serial number and an author index 
to this list. 



G. Electronic Structure Correlator 
(No company association) 

A special purpose computer has been proposed to 
search organic chemical structures. Organic chemical 
structures are translated into a notation called the 
Gordon-Kendall-Davison (GKD) notation, in which struc- 
tural formulae are treated as topological systems of 
atoms connected by links, without considering bond 
multiplicities. Atoms are represented by their normal
chemical symbols except that CH3, CH2, and CH groups
are symbolized as J, L, and M respectively. All symbols
are given an order of seniority. The senior atomic
symbol is ciphered first; the senior symbol linked to
the first cipher is linked next, and so on. A ring clos- 
ure is indicated by an X (381). 

The Electronic Structure Correlator which is to 
be used for the structure searching operation is a high 
speed, sequence-controlled electronic digital computer 
coupled to punched card reading and sorting equipment. 
The chemical structure is coded on punched cards. 
Each cipher is treated as a 12 digit binary number di- 
vided into 3 fields and occupying a single column on a 
standard punched card. One hole is punched if an atom- 
ic symbol is represented. A second field of 4 holes 
represents the coordination number (the number of ci- 
phers to which this cipher is linked) and a third field 
of seven holes represents the atomic number. Carbon
groups J, L, and M are given pseudo atomic numbers (382).
A link table (a table which lists the connections among
ciphers) is prepared by machine from the structure
data in punched card form. This link table is stored in
the computer's memory during a search. The machine
scans the data for coexistence of structure elements 
(the ciphers) and then checks whether these ciphers are 
connected in a predetermined way. 
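
The 12-digit binary cipher and the link table check can be illustrated with a short sketch. The order of the three fields within the 12 bits, the function names, and the sample structure are assumptions made for illustration; the source gives only the field widths (1, 4, and 7 holes).

```python
from typing import List, Sequence, Tuple


def pack_cipher(is_atom: bool, coordination: int, atomic_number: int) -> int:
    """Pack one cipher into 12 bits: 1 flag bit, 4-bit coordination number, 7-bit atomic number."""
    if not 0 <= coordination < 16:
        raise ValueError("coordination number must fit in 4 bits")
    if not 0 <= atomic_number < 128:
        raise ValueError("atomic number must fit in 7 bits")
    return (int(is_atom) << 11) | (coordination << 7) | atomic_number


def unpack_cipher(word: int) -> Tuple[bool, int, int]:
    """Recover (flag, coordination number, atomic number) from a 12-bit cipher."""
    return bool((word >> 11) & 1), (word >> 7) & 0xF, word & 0x7F


def contains_linked_pair(symbols: Sequence[str], links: Sequence[Tuple[int, int]],
                         first: str, second: str) -> bool:
    """True if some cipher `first` is directly linked to some cipher `second`."""
    return any({symbols[a], symbols[b]} == {first, second} for a, b in links)


carbon = pack_cipher(True, 4, 6)          # a carbon atom linked to 4 other ciphers
symbols: List[str] = ["O", "L", "J"]      # hypothetical ciphers of a small structure
links = [(0, 1), (1, 2)]                  # link table: O-L and L-J
```

The search described in the text has the same two stages: first the ciphers are checked for coexistence, then the link table is consulted to see whether they are connected as required.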

The Electronic Structure Correlator will have 6 
main memory units with a capacity of 80 symbols and 
6 smaller memory units of 1-symbol capacity, together 
with 4 separate adders and the usual collection of 
"gates," "trigger controls," and auxiliary units. It
will be considerably smaller than a large scale general 
purpose computer (383). 

Use: There is no indication that any work has 
been done on the construction of any equipment men- 
tioned in this paper. 



IV. Miscellaneous Applications 

A. Computer Preparation of Manual 
Coordinate Indices 



Tabledex 



A coordinate index which is to be partially pre- 
pared by a computer is suggested by Ledley (384). This 
coordinate index will have the following parts: 

Part I: The bibliography proper, or list of arti- 
cles with citations and, if desired, some descriptive 
material in addition to the citation. 

Part II: An alphabetic list of descriptive words
(descriptors) by means of which the retrieval is
accomplished.

Part III: The indexing tables.

The tables which are illustrated below contain 2 
types of numbers, the underlined article numbers down 
the left column and the non-underlined word numbers 
(descriptors) comprising the rows. There is one table 
for each distinct word of the word (descriptor) list and 
each table is numbered with this word number (385).

Table 5.1 (386)

The tables are used as in other non-manipulative
coordinate indices. The word (descriptor) list is first 
checked for the identifying number of each descriptor. 
The table for the smallest number is then selected and 
the row which contains all the other given descriptor 
numbers is checked. The underlined article numbers 
for rows in which all the specified descriptor numbers 
are included represent articles of potential interest 
(387). Instead of translating the descriptors into num- 
bers, the descriptors themselves can be used in the in- 
dex tables (388). 
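
The look-up procedure can be sketched as follows. This is an illustration under stated assumptions rather than Ledley's program, of which no details are given: it assumes that each row of a table lists only the word numbers larger than the table's own number, and the names build_tables and search and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Row = Tuple[int, List[int]]  # (article number, higher word numbers of that article)


def build_tables(articles: Dict[int, Set[int]]) -> Dict[int, List[Row]]:
    """One table per word number; each row is an article indexed under that word."""
    tables: Dict[int, List[Row]] = defaultdict(list)
    for article in sorted(articles):
        words = sorted(articles[article])
        for i, word in enumerate(words):
            # Assumption: a row lists only word numbers larger than the table's own.
            tables[word].append((article, words[i + 1:]))
    return dict(tables)


def search(tables: Dict[int, List[Row]], query: Iterable[int]) -> List[int]:
    """Open the table of the smallest query number and scan its rows for the rest."""
    wanted = sorted(set(query))
    smallest, rest = wanted[0], set(wanted[1:])
    return [article for article, row in tables.get(smallest, []) if rest <= set(row)]


articles = {1: {3, 7, 12}, 2: {3, 12}, 3: {7, 12, 20}}
tables = build_tables(articles)
hits = search(tables, [12, 3])  # articles indexed under both word 3 and word 12
```

Starting from the smallest word number means only one table has to be opened per search, whatever the number of descriptors in the question.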

The coordinate index preparation consists of the 
following steps. The indexer selects descriptors from 
each article. The bibliographic citation, additional in- 
formation, and descriptors are translated into a form 
which a computer can handle, i.e., punched cards and
conversion to magnetic or paper tape. The computer
will automatically assign both article numbers and
descriptor numbers, if these are used, and will also form
all tables. The output from the computer will be the
completed bibliography, Parts I, II, and III, printed
even with the correct page formats, ready for photo- 
offset duplication and binding into books. All this can 
be accomplished by the computer within a few hours 
(389). 

No details of the computer program are given. 
There is no indication that any experimental work has 
been done on this system. 

The size of a Tabledex for 10,000 articles, 6,000
words (3,000 words with an average of 2 synonyms per
word) and an average of 10 descriptors associated with 
each article, is estimated to be about 300 pages, if the 
print is similar to that of Webster's New Collegiate 
Dictionary, 1953 (390). 

Univac 

A computer-produced non-mechanized coordinate 
indexing system is proposed by O'Connor (391). The 
system is similar to that used by Batten (392), and 
this resemblance is acknowledged by calling the basic 
units Batten cards (393). O'Connor suggests a card
with a larger capacity than the one used by Batten. A
card with 1,000 positions (100 columns and 10 rows) is
proposed with the standard IBM card as a second alter- 
native if the large card is impractical to mass produce 
(394). As in the Batten system, one card would be 
made out per descriptor. Each document would be 
identified by a serial number. The serial number 
would also stand for a position on the Batten (descript- 
or) card. This position would be punched on all Batten 
cards which applied to the indexed document. A search 
with this system would consist of selecting pertinent 
Batten cards, superimposing them and identifying the 
punched positions shared by all the selected Batten 
cards. The position numbers so identified would stand 
for serial numbers of potentially pertinent documents. 
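
Superimposing Batten cards is equivalent to taking the logical product of one bit pattern per descriptor. A minimal sketch, assuming each card is modelled as a Python integer whose bit n stands for the position of document serial number n (the 1,000-position limit of a physical card is ignored here, and all names and sample documents are illustrative):

```python
from functools import reduce
from typing import Dict, Iterable, List, Set


def build_cards(documents: Dict[int, Set[str]]) -> Dict[str, int]:
    """One card per descriptor; bit n is 'punched' if document n carries the descriptor."""
    cards: Dict[str, int] = {}
    for serial, descriptors in documents.items():
        for descriptor in descriptors:
            cards[descriptor] = cards.get(descriptor, 0) | (1 << serial)
    return cards


def superimpose(cards: Iterable[int]) -> int:
    """Positions punched in every card of the stack (logical product)."""
    return reduce(lambda a, b: a & b, cards)


def punched_positions(card: int) -> List[int]:
    """Serial numbers corresponding to the punched positions of a card."""
    return [n for n in range(card.bit_length()) if (card >> n) & 1]


documents = {17: {"steel", "corrosion"},
             42: {"steel", "corrosion", "welding"},
             99: {"steel"}}
cards = build_cards(documents)
hits = punched_positions(superimpose([cards["steel"], cards["corrosion"]]))
```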

The computer-produced Batten cards, in this system,
are supplemented with a computer-produced
descriptive directory. This is an alphabetic list of
descriptors, their code numbers (used instead of spelling
them out, to save space on the Batten card), and the 
approximate number of documents entered under each 
descriptor. A document accession list is also part of 
the directory (394). 

Both the Batten cards and the descriptive direct- 
ory can be produced by a computer. Univac, for ex- 
ample, can produce magnetic tapes for a tape fed print- 
er for the directory and for a tape-to-card converter 
for the Batten cards (395). Multiple copies can be 
made by gang-punching and print-copying devices re- 
spectively. 

A hypothetical file is described to illustrate the
speed of retrieval. This is a file of 100,000 documents
which is indexed with a vocabulary of 5,000
descriptors. The index is entered on 50,000 punched
cards with 1,000 one-line entries, each of 100 digit
length. A search for 6 descriptors, making use of a
logical sum, product, and difference, can be conducted
in about one hour if each descriptor applies to about
1% of the collection and if the answer consists of about
20 documents. Searching the same file for a 4-descriptor
logical product requires only about 17
minutes (396).

The use of this system for the preparation of annual
subject indices to technical journals is suggested.
It could also be used to supplement mechanized
systems so that the machine would only be used for the
more difficult searches (397). 

This work is still in the experimental stage. 



B. Literary Data Processing 
Preparation of Concordances by Computers 

Data processing machines such as punched card 
sorters, computers and auxiliary equipment are now 
being used to assist the scholar in linguistic and con- 
ceptual analysis. Data processing techniques have been 
used for this purpose on the Summa Theologica of St. 
Thomas Aquinas (398, 399), part of the Dead Sea Scrolls 
(400, 401), and the Bible (402). In literary data
processing the text is machine analyzed and indexed down
to the simplest meaningful element, the word (403).
The words are then compiled with the assistance of
the scholar in a number of specialized lists such as
the following: 

1. An alphabetic list of all the words in 
the text as many times as they appear, along 
with the identifying data. 

2. A list of all graphically different words 
(house and houses would appear separately in 
such a list), along with the number of times 
each word appears in the text and the alpha- 
betic position which it occupies among all the 
words. 

3. A list of all word families as defined
by the scholar, who also examines all the different
words and who groups the various
forms of each graphic-semantic family under
a single parent expression or word. For example,
the words were, are, be would be represented
by the identity listing To be. Homographs
are also separated in this list.

4. A lexicon which combines the features 
of the second and third lists. The lexicon will 
show the parent words, related component
terms, frequency of each word's appearance, 
and the sum of frequency counts of all the 
words in the family. 

5. A reverse index which combines the fea- 
tures of the first list and the third list. This 
list will show parent words, list related component
words as many times as they appear,
and show by reference number the location of 
each appearance by the word endings. 

6. A concordance which will have for each 
word in the text one or several lines of text in 
which the word appears (404).
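
The last of these lists, the concordance, is the simplest to illustrate. A minimal sketch, assuming the text is available as a sequence of lines and that the reference for a line is simply its ordinal number; the function name and the sample text are illustrative and no grouping into word families is attempted:

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Entry = Tuple[int, str]  # (line reference, full line of text)


def concordance(lines: Sequence[str]) -> Dict[str, List[Entry]]:
    """For each graphically different word, list the lines in which it appears."""
    entries: Dict[str, List[Entry]] = defaultdict(list)
    for ref, line in enumerate(lines, start=1):
        for raw in line.split():
            word = raw.strip(".,;:!?\"'()").lower()
            if not word:
                continue
            # Record a line once per word, even if the word recurs within it.
            if not entries[word] or entries[word][-1][0] != ref:
                entries[word].append((ref, line))
    return dict(sorted(entries.items()))


text = ["In the beginning was the Word,", "and the Word was with God."]
conc = concordance(text)
```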

The machine procedure for the literary analysis 
of the Summa Theologica is outlined by Tasman: 

1. The scholar analyzes the text, marking 
it with precise instructions for card punching. 

2. A clerk copies the text using a special
typewriter which operates a card punch. This 
typewriter has a keyboard similar to that of a 
conventional typewriter and produces the phrase 
cards. These cards contain all the lines or 
phrases of the text, one on each card, in sequence,
transcribed in symbols (punched holes)
that can be understood by the machine. Each 
phrase is preceded by the reference to the 
place where this line is found and provided by 
a serial number and a special reference sign. 
A second clerk types each phrase of the text 
on the appropriate phrase card which had al- 
ready been punched using a checking machine. 
In this way the accuracy of the text cards is 
rigorously checked and cards containing transcription
errors are replaced.

3. From the phrase cards the machine au- 
tomatically produces the word cards and a 
complete copy of the text, phrase by phrase. 
Each of these word cards contains only a single
word of the text ... accompanied automatically
by identifying data (405).

The identifying data for each word includes: 

a. The quotation of the place where the word 
is found. 

b. The first letter of the preceding word and 
the first letter of the following word. 

c. The number indicating the position which 
this word holds in the text, e.g. the 121st or 253rd 
word. 

d. A special reference mark which characterizes
the phrase to which the word belongs, e.g., "Here
St. Thomas refers to another passage of his own work."

The context of the same word is printed on the 
reverse side of each word card. The horizontal space 
between punched lines is used and a maximum of 12 
lines or an average of 100-120 words can be transcribed 
(406). 

The machine prepares lists of different words on 
separate cards from the word cards. These cards are 
called form cards and contain in addition to the word 
the number of times it is found in the text and a num- 
ber which indicates its alphabetic sequence in this list 
of words. This is done as follows: 

a. The machine puts all word cards in alpha- 
betic order. 

b. The machine prints on sheets of paper the
different words which it finds by examining all
the word cards at a rate of about 6,000 cards
per hour. It prints the first word. If the following
word is different from it, it prints it.
If it is identical, it does not print it, but
counts it. After it has finished counting all
identical words, it prints the total number next
to the printed word. It proceeds then to print
the following different words and so forth.
The machine punches at the same time another
set of cards, one card for each different word
(407).
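
The two steps above (alphabetize, then print each different word once with its count) correspond to counting runs in a sorted list. A minimal sketch, assuming each word card is reduced to the word itself; the name form_cards and the sample words are illustrative:

```python
from itertools import groupby
from typing import Iterable, List, Tuple

FormCard = Tuple[str, int, int]  # (word, number of occurrences, alphabetic position)


def form_cards(word_cards: Iterable[str]) -> List[FormCard]:
    """Sort the word cards, then emit one form card per run of identical words."""
    ordered = sorted(word_cards)                      # step a: alphabetic order
    forms: List[FormCard] = []
    for position, (word, run) in enumerate(groupby(ordered), start=1):
        forms.append((word, sum(1 for _ in run), position))  # step b: count the run
    return forms


forms = form_cards(["esse", "deus", "est", "deus", "esse", "deus"])
```

Each resulting tuple carries the three items the text assigns to a form card: the word, the number of times it is found, and its place in the alphabetic sequence.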

The form cards are then analyzed by the scholar 
who separates homographs and combines words of one 
and the same graphic-semantic unit. The sum total of
these words becomes the entry words. A card is 
punched for each entry word and is numbered sequent- 
ially. The form cards and word cards are then grouped 
under their respective entry card. A second code num- 
ber (identifying the entry word) is automatically punched 
in all word and form cards. At the same time the ma- 
chine adds to the entry card an identification of the to- 
tal number of times in which it occurs in the text in 
one form or another. Lists such as those indicated 
can now be printed from these cards without any furth- 
er effort on the part of the scholar. The printing speed 
ranges from 4,800 to 60,000 lines per hour depending
on the type of machine used (408). 

While conventional punched card machines are 
being used for the literary analysis of the Summa Theo- 
logica, a large scale computer, the IBM 705, is being 
used for such analysis of the Dead Sea Scrolls (409). 
The procedure appears to be similar to the all-punched- 
card technique except that the data is transcribed onto 
magnetic tape and processed more speedily. 

The two principal advantages of machine technique 
compared with conventional literary analysis are greater 
speed and higher accuracy. The comparative speeds of 
the different techniques for indexing the approximately 
13,000,000 words of St. Thomas' complete works are
estimated as follows:

Manual index: 50 scholars, 40 man years each; 2,000 man years

Punched card machines: 10 scholars, 4 man years each; 40 man years

Large scale data processing equipment, e.g. IBM 705: 10 scholars,
less than 1 man year each; 10 man years (410)

C. Autoabstracting

Autoabstracting is the selection by a machine of 
key sentences in a document and the reproduction of 
these sentences in their order of appearance in the or- 
iginal document. This work is described in 3 publica- 
tions by Luhn (411), Savage (412), and Rowe (413), all 
workers at the International Business Machines Corpora- 
tion. 

In order to select the sentences in a document
which are most suitable for the autoabstract, a measure
of information count of each sentence has to be
determined. This is called the "significance factor" of
the sentence (414). The significance factor is based on
the usage and position of significant smaller units in
the sentence, that is, words. Significant words are
determined by making a list of all of the words in a
document and arranging them in descending order of use.
This list of words is refined by eliminating types of
words which have little discriminating power, e.g.,
conjunctions, articles, prepositions, and by combining
words with common roots, e.g., differ, differentiate,
different, differently, difference, differential, and
considering these as one word. From this list of words
significant words are chosen on the basis of frequency
of usage in a particular document. In an illustrated
autoabstract of a 2,326 word document, 39 words which
occurred 5 or more times in the document were selected
as significant words (415).

The position of these significant words in the sentences
is next considered. Luhn's argument is:

Whatever the topic, the closer certain words 
are associated, the more specifically an aspect 
of the subject is being treated. Therefore, 
wherever the greatest number of frequently oc- 
curring words are found in greatest physical 
proximity to each other, the probability is very 
high that the information being conveyed is 
most representative of the article (416).

The significance factor of a sentence is derived 
from the number of significant words within the sen- 
tence and the linear distance between them due to the 
intervention of non-significant words, i. e. the proxim- 
ity of significant words. An analysis of many docu- 
ments has indicated that a useful limit is 4 or 5 non- 
significant words between significant words (417). All 
the sentences are ranked in order of significance factor 
and one or several of the highest ranking sentences may 
then be selected to serve as the autoabstract. 

An experiment with autoabstracting is described 
by Savage (418). The document which is to be autoab- 
stracted has first to be converted to a language which 
the machine understands. In the case of the IBM 704, 
this will have to be magnetized spots on magnetic tape. 
In order to save the time-consuming step of first trans- 
lating the printed document word by word onto punched 
cards, a by-product of an earlier operation was used. 
This is the 31-channel punched paper tape which is
prepared for the "Monotype" type casting operation. This
tape was converted onto punched cards by means of a 
special tape and card converter. The cards were then 
converted onto magnetic tape by means of the standard 
card-to-magnetic-tape converter. This tape was fed in- 
to the IBM 704 for processing. The document is first 
separated into individual words and sentences, as 2 
separate files. Common words are deleted from the 
list of words by means of a special look-up table. The
remaining words are then alphabetized. A frequency 
study of the words is made; words with the same stem, 
e. g. different, difference, differently, are treated as 
one. In the sentence file, each sentence and the occur- 
rence of each word are numbered serially from the be- 
ginning of the document. The average sentence length 
and the average word frequency are determined. A list 
of locations of any word which occurred more times 
than the average word frequency is produced. High 
frequency words are traced back to the sentences where- 
in they occurred and their position is noted. The por- 
tions of the sentences bounded by the high frequency 
words are then bracketed, providing that between any 2 
high frequency words there are no more than 4
non-common low-frequency words. If this is not the case,
3 or more bracketed sections are constructed. The 
sentence is assigned a value corresponding to the square 
of the number of high frequency words, divided by the 
total number of non-common words in the bracketed 
section. The sentences are then sorted according to 
this value and the highest ones are selected for the 
autoabstract (419). 
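
The scoring rule just described can be sketched in a few lines. This is a simplified illustration, not the IBM 704 program: the list of common words is a small hypothetical stand-in for the look-up table, stemming is omitted, and where a sentence yields several bracketed sections the sketch assumes that the highest value is the one assigned to the sentence.

```python
import re
from collections import Counter
from typing import List, Sequence, Set, Tuple

# Hypothetical stand-in for the look-up table of common words.
COMMON = {"the", "a", "an", "of", "and", "or", "in", "to", "is", "are",
          "by", "for", "with", "that", "this", "on", "as", "be"}
MAX_GAP = 4  # at most 4 non-common low-frequency words between two high-frequency words


def tokenize(sentence: str) -> List[str]:
    """Lower-case words of a sentence with the common words deleted."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in COMMON]


def brackets(words: Sequence[str], high: Set[str]) -> List[Tuple[int, int]]:
    """(start, end) index pairs of sections bounded by high-frequency words."""
    spans: List[Tuple[int, int]] = []
    start = last = None
    for i, word in enumerate(words):
        if word not in high:
            continue
        if last is not None and i - last - 1 > MAX_GAP:
            spans.append((start, last))   # gap too long: close the current section
            start = None
        if start is None:
            start = i
        last = i
    if start is not None:
        spans.append((start, last))
    return spans


def sentence_score(words: Sequence[str], high: Set[str]) -> float:
    """Square of the high-frequency word count over the section length, best section."""
    best = 0.0
    for start, end in brackets(words, high):
        section = words[start:end + 1]
        n_high = sum(1 for w in section if w in high)
        best = max(best, n_high ** 2 / len(section))
    return best


def autoabstract(sentences: Sequence[str], top: int = 1) -> List[str]:
    """Select the `top` highest-scoring sentences, in their original order."""
    tokens = [tokenize(s) for s in sentences]
    freq = Counter(w for t in tokens for w in t)
    average = sum(freq.values()) / len(freq)
    high = {w for w, n in freq.items() if n > average}
    ranked = sorted(range(len(sentences)),
                    key=lambda i: sentence_score(tokens[i], high), reverse=True)
    return [sentences[i] for i in sorted(ranked[:top])]


document = ["Magnetic tape stores the card file.",
            "The tape and the card are read by the tape unit.",
            "Results were encouraging."]
abstract = autoabstract(document, top=1)
```

In this toy document the second sentence is selected: it contains three high-frequency words within a section of four non-common words, giving a value of 9/4.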

The method described has been used on about 50 
articles, 300 to 4500 words each. The results have 
been encouraging enough to be further evaluated by a 
psychological experiment involving 100 people (420). 

Luhn concludes: 

The results so far obtained for technical arti- 
cles have indicated the feasibility of automat- 
ically selecting sentences that will indicate the 
general subject matter, very much as do con- 
ventional abstracts. What such autoabstracts 
might lack in sophistication they will more 
than compensate for by their uniformity of 
derivation (421). 



D. Autoencoding 

An indexing system in which the equivalent of in- 
dexing headings are selected by machine has been pro- 
posed by Luhn (422). The name which he assigns to 
this proposed system in another publication (423) is 
autoencoding. The words which serve the function of 
descriptors or index headings are called notions. Each 
subject area will have its own set of notions and as a 
preliminary step a set of notions peculiar to a partic- 
ular field must be collected from an analysis of a rep- 
resentative sample of the collection. This job is done 
partially by machine and partially by human workers. 
The sample documents in a collection are transcribed 
into machinable form, i. e. onto punched cards, punched 
tape, or magnetic tape. Certain classes of words, e. g. 
nouns, and adjective qualifiers, are identified by special 
symbols since they have to be recognized in subsequent
steps of this procedure. These words or words with 
similar meanings will eventually become the notions. 
A card index of all the sentences is then prepared. A 
concordance of the words identified with special sym- 
bols is prepared from these cards. The next step is 
the grouping of words of similar or related meanings 
into notional families (424). This work is similar to 
that involved in the preparation of Roget's Thesaurus 
and the author suggests that the organization of the 
thesaurus serve as a basis for the operation (425). 
The preparation of the thesaurus, i. e. the organization 
of words into notional families, requires an expert 
knowledge of the subject. Individual words selected to 
characterize information should have about equal
discriminating power as far as this particular body of
information is concerned. The words (notions) will in
this illustrative example be the nouns and adjective
qualifiers, both of which were differentiated with special
symbols. The notions are grouped into notional
families, which in turn should have about equal dis- 
criminating power. 

The final thesaurus will consist of 2 parts: the 
first part is the listing in some systematic order of 
the notional families, each identified by an index sym- 
bol such as a number or key word. Each of these 
would represent a listing of the words from the sample 
documents which are related with respect to the notion 
they express. The second part would be an alphabetic 
index of the words occurring in the first part, giving 
the key word or index number of the one or several 
notional families of which the given word is a member 
(426). The number of notional families is estimated 
to be less than 1,000 and this number will probably
grow at a very low rate (427). 

The autoencoding of the documents will be carried 
out with the aid of a dictionary of notions stored in the 
computer's memory. Each document is recorded in
terms of notional elements and a pattern is thus cre- 
ated which will be used in searching for documents 
with related notions. A notion is assigned a special 
significance if it occurs twice in a paragraph or if it 
occurs in succeeding paragraphs. Another way of selecting
major notions (the name given to notions which have
special significance) is to select
words occurring in titles, headings, and resumes.

The machine operation is described by Luhn: 

All encoding would be carried out by a data- 
processing machine having a direct-access in- 
formation storage and a look-up device. The 
dictionary index would be entered in one por- 
tion of the storage device and the document 
file previously prepared for making the con- 
cordance would now be processed in accord- 
ance with an encoding program. Each noun 
within a paragraph would be looked up and 
the corresponding family number or numbers 
extracted from the index storage and stored 
in a separate storage portion. The machine 
would then determine which of the words have 
a notion in common by comparing each family 
number in turn with the other numbers stored 
for nouns of the current, as well as the pre- 
ceding and succeeding paragraphs. Since 
matching family numbers are indicative of a 
major notion, they would be entered in a third 
portion of the storage device. Words which 
fail to attain the status of major notions would 
not be entered. 

Upon exhaustion of this procedure and the 
recording of associated names, the encoded 
version of a paragraph would be ready for 
transfer to a permanent storage medium such 
as a reel of magnetic tape. Family numbers 
stored in the second portion of storage would 
be retained only during the analysis of the 
next-following paragraph. As the encoding of 
the new paragraph proceeds, the family num- 
bers of its words would now also be compared 
with those of the immediately preceding and 
succeeding paragraphs to find common notions. 
This process would be repeated for each suc- 
cessive paragraph until the end of the chapter 
has been reached. The end result would be a
mechanically prepared notional abstract (428).
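
The paragraph-by-paragraph comparison of family numbers can be sketched as follows. This is an illustration under assumptions, not Luhn's program: a family number is treated as a major notion if it is found for two or more nouns of the same paragraph or also occurs in the preceding or succeeding paragraph, and the dictionary index, family numbers, and nouns shown are invented for the example.

```python
from collections import Counter
from typing import Dict, List, Sequence, Set


def family_counts(nouns: Sequence[str], index: Dict[str, Set[int]]) -> Counter:
    """Look up each noun and count how often each family number is extracted."""
    counts: Counter = Counter()
    for noun in nouns:
        for family in index.get(noun, ()):
            counts[family] += 1
    return counts


def notional_abstract(paragraphs: Sequence[Sequence[str]],
                      index: Dict[str, Set[int]]) -> List[List[int]]:
    """Major notions (family numbers) of each paragraph, in paragraph order."""
    counts = [family_counts(p, index) for p in paragraphs]
    abstract: List[List[int]] = []
    for i, current in enumerate(counts):
        neighbours: Set[int] = set()
        if i > 0:
            neighbours |= set(counts[i - 1])      # preceding paragraph
        if i + 1 < len(counts):
            neighbours |= set(counts[i + 1])      # succeeding paragraph
        major = {f for f, n in current.items() if n >= 2 or f in neighbours}
        abstract.append(sorted(major))
    return abstract


# Hypothetical dictionary index: word -> family numbers.
index = {"tape": {12}, "reel": {12}, "card": {30},
         "scholar": {45}, "abstract": {51}, "summary": {51}}
paragraphs = [["tape", "reel", "scholar"], ["card", "abstract"], ["summary", "card"]]
encoded = notional_abstract(paragraphs, index)
```

Family 45 is dropped from the first paragraph because it matches nothing else, which corresponds to the words that "fail to attain the status of major notions."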

The collection of autoencoded documents would be 
searched by reversing the operation. 

The inquirer will be asked to prepare an essay 
on what he wants, why he wants it, and anything else 
that might have a bearing on the question. The query 
document will then be translated into a notional abstract. 
This is done in the same way as in the case of the pro- 
cedure used for encoding the documents in the collec- 
tion. The query notional abstract is set up as the 
question pattern. The machine is asked to match this 
notion pattern with document notion patterns and to ac- 
cept documents with a stated degree of similarity. An 
exact matching is unlikely in most cases since the an- 
swer would be an exact duplication of the question. To 
make the search more efficient, the search can be di- 
vided into 2 or more steps. The first step would be 
a rough screening for likely documents and this would 
eliminate the bulk of the file. The second step would 
be the search for the final specified conditions. If the 
search requirements have to be altered, subsequent 
searches would probably be done on a small part of 
the collection separated in the first screening operation 
instead of on the entire file of documents (429). 
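
A two-step search of this kind might be sketched as below. The source does not define the "degree of similarity"; the sketch assumes, purely for illustration, the fraction of the query's notions that also occur in the document, and the names, threshold, and sample collection are hypothetical.

```python
from typing import Dict, List, Set


def similarity(query: Set[int], document: Set[int]) -> float:
    """Assumed measure: share of the query notions found in the document."""
    return len(query & document) / len(query) if query else 0.0


def match_query(query: Set[int], collection: Dict[str, Set[int]],
                threshold: float = 0.5) -> List[str]:
    """Rough screening first, then ranking of the survivors by similarity."""
    # Step 1: eliminate the bulk of the file (no notion in common with the query).
    candidates = {doc: notions for doc, notions in collection.items() if query & notions}
    # Step 2: keep candidates that reach the stated degree of similarity.
    scored = sorted(((similarity(query, n), doc) for doc, n in candidates.items()),
                    reverse=True)
    return [doc for score, doc in scored if score >= threshold]


collection = {"R-1": {12, 30, 51}, "R-2": {45}, "R-3": {12, 77}, "R-4": {30, 51, 88}}
query = {12, 30, 51, 60}
answers = match_query(query, collection, threshold=0.5)
```

If the requirements are altered, only the candidates retained by the first step need to be examined again, as the text suggests.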

In a later article Luhn suggests the preparation
of the autoabstract of a document and the index to the
document (the autoencoded document) in one series of
operations (430).

Use: The soundness and practicality of the system
have not as yet been proven by any full scale operation.
Experiments involving some 1,200 technical
reports have produced encouraging results (431).



V. Summary and Conclusions 

Machine based systems for literature searching, 
notwithstanding the vocal efforts by their proponents, 
have had very little impact on libraries or information 
centers. The type of usage which is made of informa- 
tion in the library and the overall economics of the 
problem seem to be the reasons for this. A large 
portion of the requests for information in libraries, 
perhaps the vast majority, seem to be of the reference 
question type which are answered relatively efficiently 
with the existing bibliographic apparatus and which would 
not be answered more efficiently with machine based 
systems. This type of request can be exemplified by
"a list of publications by an author," "a few good articles
for background material on a subject," or the physical
properties of a particular chemical. Reference questions
such as these can be answered rather readily by
the existing bibliographic apparatus which is the card
catalog, published indices, reference books and other 
sources in that particular library or in other libraries, 
all coordinated by the reference librarian. This pre- 
dominant kind of use of information in the library 
makes it necessary to keep the existing bibliographic 
organization and to superimpose on it something which 
can handle comprehensive searches more efficiently. 
Since these represent only a minority of the requests, 
particularly if we count only the comprehensive search- 
es that cannot be handled efficiently by existing tools, 
the economics are against the installation of a machine 
based system in most libraries or information centers. 

The reasons why existing methods are not effic- 
ient for some comprehensive searches are these: 

1. The mere size of the literature in many fields
makes the bibliographic apparatus too difficult to search.

2. Searches require different access to the information
than that provided by the index. For example,
a group of chemicals which are indexed by structural 
characteristics are difficult to search by type of use 
and vice versa. 

3. The indexing is often not consistent or is incomplete.

It should be pointed out that machine based systems
will only speed up part of the searching operation,
because a part of the search does not lend itself to
mechanization. This is pointed out by Shaw when he
breaks down a search, or bibliography as he calls it,
into the following steps:

I. Planning the bibliography: Determining the
scope of the search, determining the sources
to be searched, determining the headings
under which the search will be made,
estimating the time required and scheduling
the operation

II. Searching: Consulting the sources under the
subjects indicated and selecting possible
references; modifying the list of sources and
of headings.

III. Copying the citations that appear to be
pertinent.

IV. Locating copies of citations noted in
sponsoring library and elsewhere.

V. Verifying the citations for accuracy and
completeness.

VI. Analyzing the article to determine whether it
is actually pertinent; preparing annotations
or abstracts whenever this is called for.

VII. Organizing the material.

VIII. Editing the search report.

IX. Reproducing the search report in its final
form (432).



Shaw points out that Steps I, VI, VII, and VIII,
which represent a major portion of time in a high grade 
search, cannot be mechanized (433). Searching the in- 
dex headings and copying the references by machine will 
result in the greatest savings. But scanning the index
headings cannot be completely entrusted to the machine
for reasons outlined below. Consequently only part of
the overall time for this operation will be saved and 
it is somewhat questionable whether in the average in- 
stallation the savings in search time will compensate 
for the time and money invested in the machine and in 
indexing the information for machine searching if exist- 
ing conventional indexing methods have to be maintained 
as well. 

The machine can scan a large number of index 
entries more quickly and more efficiently than a human 
searcher if the search can be so phrased and programmed
that the decision involved in accepting or rejecting
a reference is a mechanical one. This qualification
is emphasized because it is the fly in the ointment.
No indexing system, with the exception of an
index to a subject which lends itself to unambiguous
definition such as an index to chemical structure or to
numeric data, is now at a stage where the scanning of
entries is a completely mechanical operation. A good
searcher does not trust the index heading alone; he 
knows that because of inconsistencies in the indexing, 
because of indexing from a different point of view than 
that of the search, because of indexing from the ab- 
stract instead of the original data, because the author 
did not spell out implied information for the indexer, 
information has to be read into the index heading and 
the abstract or additional information about the docu- 
ment has to be obtained before a decision is made to 
accept or reject the document. The danger here lies not
in selecting a borderline document but in rejecting
documents on the basis of the index heading alone.
In view of
these deficiencies in an index, some of which can only 
be corrected by standardizing the authors, the scanning 
of index headings is not strictly a mechanical routine. 
Machines can be programmed to include a great deal of 
this borderline material but if this is done then the dis- 
criminating power of the machine search is greatly re- 
duced and the bulk of the burden will again be placed 
on the human searcher. The literature on the use of 
machine based systems seems to substantiate these con- 
tentions. 

Of the 42 electronic machine-based information
retrieval systems discussed in the literature, 13, or only
about 1/3, are in actual operation. The size and use
of these systems, when size and use were indicated,
seem to be low. Nine of these installations make use
of the IBM 101. Twenty-four of the other reported
systems are in the experimental stage, while the
remaining 5 are in the "blue sky" stage.

Although fewer electronic machine based data 
files are discussed in the literature, a higher percent- 
age of them are in actual operation. Of the 7 data 
files reported, 4 are in operation, 2 are in the
experimental stage, and one is in the "blue sky" stage.

We should not assume from the present state of 
the art that the situation will remain static. There is 
no doubt in the writer's mind that additional work on
indexing systems will not only improve the convention- 
al manual systems but will also provide a more solid 
foundation for machine based indexing systems. This 
has been pointed out by Lowry: 

The present outlook for development of satis- 
factory machine techniques is very encourag- 
ing. This does not mean that we will have 
our machine system next year or even in five 
years but it is coming as surely as new cures 
for old maladies inevitably come. It will be 
forced upon us by demands of science and pro- 
gress and solution may be found by those who 
need the information rather than by those who 
provide it (434). 

In conclusion, we can say that the use of electron- 
ic machines in information retrieval systems is, to put 
it conservatively, not very widespread. This seems to 
be because machine based indexing systems in their 
present state of development have to supplement rather 
than replace existing conventional indices. The econ- 
omics are against this in most situations. The picture 
looks brighter in areas where conventional indexing 
systems are not as deeply entrenched, or where the 
semantic problem is not as great. Data files seem to 
fall under this category. The use of machines to assist
the indexer in analyzing his vocabulary seems to be
another promising application. Finally, the use of 
machines instead of human beings to do the intellectual 
tasks of abstracting and indexing is proposed as the ul- 
timate solution. Like most ultimate solutions, this one 
seems to be a long way off.

Volume Four Part Five
CODING IN YES-NO FORM

by 
Doralyn J. Hickey 



Mechanical Representation of Codes 
in Yes-No Form 



The simplest form of yes-no coding is achieved 
by allotting one specific "yes" position to represent one 
and only one concept (1). The polar character of the 
yes-no relationship implies that the total body of mater- 
ial to be coded can be divided into two distinct parts 
according to whether or not it is found to contain a giv- 
en concept. For instance, books checked out of a li- 
brary may be divisible into two main categories: re- 
serve books and non-reserve books; then, one specific 
code position marked "yes" will indicate that a partic- 
ular title belongs to the reserve section, while the 
same position (on another card) marked "no" will
indicate that the second title does not belong to the
reserve section (2). On the other hand, a concept which
allows for significant variation in its components must 
code into more than one position. A typical example 
of this practice is the direct code designed for record- 
ing material on the geology and chemistry of coal; in 
this case the headings of major interest are coded di- 
rectly on one section of the coding medium while sub- 
headings are given additional positions on another sec- 
tion (3). 

Although the authorities (4, 5) define direct coding 
in terms of its ability to separate materials into two 
groups ("a" and "not a"), this phrase is also used to
designate systems in which the meaning of the code po- 
sition is directly readable simply by examination of the 
code medium (6). Almost all of the examples of direct 
coding are to be found in connection with the use of the 
marginal punched card, and in the majority of cases 
the meaning of each position on the card is printed ad- 
jacent to the code position itself (7, 8, 9, 10, 11, 12, 13). 
In some cases, however, direct coding has been employed
through the use of a printed number next to the
code position, the meaning for the number being listed 
on a separate code index (14) or on a template (15). 

The marginal punched card operates upon a rela- 
tively simple principle: small holes are perforated 
close to the edge of the card stock, and meanings are 
assigned to these holes; if the material being indexed 
is represented by one of the holes, then this position 
is notched out to the edge of the card. When a group 
of cards is sorted, the cards are superimposed and a 
small rod or needle inserted through the desired hole; 
as the needle is lifted, cards with a notch at that hole 
will fall off and can be easily separated from the rest 
(16, 17). Figure 1 shows an example of the marginal 
punched card employing direct coding. A number of 
different brands of cards are available, but the basic 
principle of operation is the same (18). 

A modified form of direct coding may also be ap- 
plied to cards designed for machine sorting (19, 20). In 
this case the design of the code bearer is responsible 
for the modification since the cards may be punched 
over their entire surface (21). These cards are pro- 
duced in three basic patterns: the IBM card with 80 
vertical columns and 12 positions per column; the Rem- 
ington Rand card with 45 columns of 12 positions each 
(or 90 columns of 6 positions each) (22); and the Pow- 
ers-Samas card with 40 columns of 12 positions each 
(23, 24). Examples of the three main types of cards 
are shown in Figures 2, 3, and 4. 

It is not impossible to make use of straight di- 
rect coding on machine sorted cards; indeed, two writ- 
ers record situations in which the positions on an IBM 
card were regarded as unique locations and a direct 
coding scheme was introduced (25, 26). This practice, 
however, often involves the punching of more than one 
position per column, which in turn complicates the ma- 
chine sorting operation since some of the machines are 
not equipped to sort for more than one position per
column (27, 28, 29). The alternative, of course, is to make
use of a modified direct code in which the columns
express given concepts and the positions in the columns
are used to designate mutually exclusive subheadings
(30). In the case that the subheadings are not
necessarily mutually exclusive, the problem of double
punching may arise again (31); however, solutions have been
discovered to overcome this difficulty, e.g., rewiring
the electrical circuits (32), inactivating certain contacts
or feeding the cards in upside down (in the case of two
punches per column) after they have been run in the
normal way (33).

Figure 1
Direct Code on a Marginal Punch Card

Figure 2
IBM Card, Actual Size, Indicating Punching Code

Figure 3
Underwood Samas Cards, Actual Size,
Indicating Punching Code

Figure 4
Remington Rand Card, Actual Size,
Indicating Punching Code

It is rare to find situations in which the machine 
sorted card contains direct coding with the complete 
meaning of the codes printed on the face of the card; 
however, the significance of the columns is sometimes 
printed directly onto the card (34, 35), and one writer 
describes a card which allows both the written and the 
coded form to appear on its face (36). 

A code medium similar to both the machine sorted 
and the marginal punched cards is the slotted card (see 
Figure 5) which is designed for needle sorting, but 
which makes use of the entire face of the card rather 
than just the edge (37). The "yes" indication is achieved
by slotting away the section of the card between two 
holes, the holes being numbered consecutively (38). Dir- 
ect coding is applicable to this medium as well as to 
the marginal punched cards (39). 

A fourth medium, which differs from those already 
mentioned by virtue of its form and of the use to which 
it is put, is the "Peek-a-boo" card, sometimes called
the "Batten" card (40) and elsewhere known as the Cor- 
donnier system (41). This system uses one card to rep- 
resent each subject heading, and positions marked over 
the entire face of the card to stand for document num- 
bers; if the subject heading pertains to a given document 
the pre-assigned number for that document will have its 
position punched out on the card (42). Documents which 
deal with each of several subjects may be located by 
superimposing the subject cards and observing the num- 
bers of those positions which allow light to pass through 
(43). The number of documents which can be indexed
per card and the size of the punch used, its placement
on the card, etc., vary widely with the particular
system chosen, but the coding is essentially direct in all
cases (44, 45, 46).

Figure 5
Direct Coding on a "Findex" (Slotted) Card
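
The superimposition of subject cards amounts to finding the document numbers common to several sets. The following is a minimal sketch in Python, not a description of any system cited here; the subject headings and document numbers are invented for illustration, and each card is modelled simply as the set of positions punched out of it.

```python
# Each subject card is modelled as the set of document numbers whose
# positions have been punched out. The headings and numbers below are
# invented for illustration only.
subject_cards = {
    "coal": {3, 17, 42, 108},
    "geology": {3, 42, 77},
    "chemistry": {17, 42, 108, 250},
}


def superimpose(cards, headings):
    """Return the positions where light passes through every chosen card."""
    stacks = [cards[h] for h in headings]
    return set.intersection(*stacks)


common = superimpose(subject_cards, ["coal", "geology", "chemistry"])
print(sorted(common))  # documents indexed under all three headings
```

Adding a further card to the stack can only keep or reduce the number of positions through which light passes, which is why the method suits searches that combine several subjects.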



Systems Using One or More Positions 
Per Concept Represented 

Before summarizing the developments in the field 
of numerical and alphabetic coding, it may be helpful to 
outline the mathematical theory of combinations so that 
the coding capacity of the various media may be under- 
stood. No attempt will be made to prove the following 
statements; the reader is referred to items no. 47, 48, 
and 49 in the notes and also to item no. 1, p. 67-68, and
no. 50, p. 276-283.

The basic formula for determining the number of 
possible combinations of "n" different items taken "m"
at a time (where the order in which the items are placed
has no effect upon the meaning of the aggregate) is
expressed by the following equation:

\[
(1)\quad C^{n}_{m} = \frac{n!}{m!\,(n-m)!} = \frac{n(n-1)(n-2)\cdots(n-m+1)}{m!}
\]

provided that n and m are positive integers and
n is greater than m.

Also:

\[
(2)\quad C^{n}_{m} = C^{n}_{n-m}
\qquad\text{and}\qquad
(3)\quad C^{n}_{n} = C^{n}_{0}
\]

If a rearrangement of the order in which the items
are taken changes the meaning of the aggregate, then the
number of possible aggregates (or permutations) is
determined by the following formula:

\[
(4)\quad P^{n}_{m} = C^{n}_{m}\,(m!) = \frac{n!}{(n-m)!} = n(n-1)(n-2)\cdots(n-m+1)
\]

In certain situations it is necessary to know how
many different patterns can be obtained from "n"
positions when each position can take on any one of "r"
different integral values and the positions may be taken
any number "m" (where "m" is an integer, 0 ≤ m ≤ n)
at a time. The formula becomes:

\[
(5)\quad C = r^{n}
\]

In the case that r = 2, then

\[
(6)\quad \sum_{m=0}^{m=n} C^{n}_{m} = (2)^{n}
\]
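
Although no proofs are offered here, the counting formulas of this section can be checked numerically. The following Python sketch is illustrative only: the function names and the sample values n = 5, m = 2, r = 3 are assumptions chosen for the check, and each formula is compared against a brute-force enumeration.

```python
from itertools import combinations, permutations, product
from math import comb, factorial, perm


def count_combinations(n, m):
    """Basic combination formula: n! / (m! (n - m)!)."""
    return factorial(n) // (factorial(m) * factorial(n - m))


def count_permutations(n, m):
    """Equation (4): the combination count multiplied by m!."""
    return count_combinations(n, m) * factorial(m)


n, m, r = 5, 2, 3
items = range(n)

# Check the formulas against brute-force enumeration.
assert count_combinations(n, m) == len(list(combinations(items, m))) == comb(n, m)
assert count_permutations(n, m) == len(list(permutations(items, m))) == perm(n, m)
# Equations (2) and (3): symmetry of the combination count.
assert count_combinations(n, m) == count_combinations(n, n - m)
assert count_combinations(n, n) == count_combinations(n, 0) == 1
# Equation (5): r values in each of n positions give r**n patterns.
assert len(list(product(range(r), repeat=n))) == r ** n
# Equation (6): with r = 2 the combinations for all m sum to 2**n.
assert sum(count_combinations(n, k) for k in range(n + 1)) == 2 ** n

print(count_combinations(n, m), count_permutations(n, m))
```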
Numerical Coding 

A type of coding closely akin to direct coding is 
the representation of the ten decimal digits (0, 1, 2, 3,
4, 5, 6, 7, 8, 9) by means of ten positions on the coding
medium; from several of these groups taken together,
the ordinary decimal numbers may be built up, letting
one group stand for the units position, another for
the tens, a third for the hundreds, and so on (51). Al- 
though this procedure is applicable to marginal punched 
cards (52), the machine sorted cards use it more often 
and in a slightly different form: columns are used to 
represent the units, tens, hundreds, etc., positions in
the decimal number, and the value of the digit is indi- 
cated by punching the desired row in the proper column 
(53). In some cases the numbers punched into the cards 
in this manner are the actual quantities which are being 
recorded (54); in other cases the numbers are them- 
selves a code standing for certain subject material (55). 

A related means for representing the ten decimal 
digits has been employed by telephone systems: each 
digit is transformed into a certain number of holes 
punched into a tape, the number of holes being equivalent
to the value of the digit (56).

A very common way to represent decimal digits 
through the use of less than ten positions is the "weight- 
ed" code, which has "the property that values, or 
weights, can be assigned to each of the . . . bits [posi- 
tions] with the decimal digit being represented equal to 
the sum of the weights" (57). Many forms of this code 
are available (58), but the 7-4-2-1 code is by far the 
most popular (59, 60, 61, 62, 63, 64). By means of this 
code, the numbers 1 through 14 may be represented, 
the number 6 being coded as a punch in the "4" position 
and a punch in the "2" position, for example (65). This 
code further has the property of facilitating sequential 
order, since a group of marginal punched cards may be 
sorted into numerical order simply by passing the needle 
through each of the four holes in sequence, each time 
placing at the end of the pack (in order) the cards which 
drop out (66, 67). Decimal numbers larger than 9 may 
be coded in either one of two ways: a second field of 
four positions (representing 70, 40, 20, and 10) may be 
added (with more fields if necessary, as the size of the 
decimal number warrants) (68), or a greater coding 
capacity may be secured by allowing each field to rep- 
resent the full 14 numbers so that the "1" position in 
the second field would represent the number 15, etc. 
(69). 
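
The 7-4-2-1 field and its needle-sorting property can be illustrated with a short Python sketch. This is a model under stated assumptions rather than a description of any particular card: a card is represented as a label together with the set of weighted positions notched on it, and the function names are invented for the illustration.

```python
WEIGHTS = (7, 4, 2, 1)


def encode_7421(value):
    """Return the set of weighted positions punched for a value 0-14."""
    if not 0 <= value <= 14:
        raise ValueError("one 7-4-2-1 field holds only 0 through 14")
    punched = set()
    for weight in WEIGHTS:          # largest weight first
        if value >= weight:
            punched.add(weight)
            value -= weight
    return punched


def needle_sort(cards):
    """Sort (label, punched_positions) cards by needling 1, 2, 4, 7 in turn."""
    pack = list(cards)
    for hole in reversed(WEIGHTS):  # 1, then 2, then 4, then 7
        stay = [c for c in pack if hole not in c[1]]   # remain on the needle
        drop = [c for c in pack if hole in c[1]]       # notched cards fall off
        pack = stay + drop          # dropped cards go to the end, in order
    return pack


cards = [(v, encode_7421(v)) for v in (9, 3, 6, 0, 7, 1, 8, 4, 2, 5)]
print(sorted(encode_7421(6)))                 # positions punched for 6
print([label for label, _ in needle_sort(cards)])
```

Four passes of the needle suffice because each pass keeps the order established by the earlier ones, so the final pass on the "7" position decides the coarsest division last.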

The application of the 7-4-2-1 code to machine 
sorted cards is relatively infrequent, although it has 
been suggested that 8 of the 12 positions per column 
could be used with this code for the representation of 
Dysonian chemical notation (70). A suggestion by Dunn 
(71) (which will be mentioned again later) has been 
transformed into the 7-4-2-1 pattern for machine sorted 
cards by Reagh (72), who recommends the 7-4-2-1 code 
as simpler, although it reduces the coding capacity of a 
column from Dunn's figure to 1,500.

The 7-4-2-1 combination is by no means the only 
available 4-place code; Richards (73) lists 17 weighted 
4-bit (i.e., 4-position) codes, some of which have the
additional property of being "self-complementing" (74). 
This latter factor is particularly useful with regard to
digital calculating machines since it means that the 9's
complement (the decimal number which, when added
digit by digit to a given number, will yield a total of 9
in each decimal position) of a digit is obtained by changing
the state of each of the switches used to represent
the digit. As Richards puts it, "The 9's complement of
each decimal digit may be obtained by changing the 1's
to 0's and the 0's to 1's in the coded representation of
the digit" (75).

A 4-position code which behaves in almost the
same manner as the more conventional 7-4-2-1 code is 
the 8-4-2-1 system (76). This is actually a 4-place de- 
velopment of the straight binary representation (77) or 
geometrical progression (78). It is not, however, a 
self-complementing code, although it can be transformed 
into one by the following process (79): "In this
[excess-3] code, 3 is added to each decimal digit to give an
excess-3 value which is then represented by a corresponding
four-place binary number" (80). Hartree cites
Stibitz as the originator of the excess-3 code and indicates
that it has three advantages over straight binary
representation:

It gives a positive indication of the digit zero,
complements are obtained by interchanging 0's and 1's,
and in the process of addition carry-over from the most
significant of the four binary digits occurs just when,
and only when, carry-over from the corresponding decimal
place is required (81).
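
The three properties claimed for the excess-3 code can be verified exhaustively, since only ten digits are involved. The Python sketch below is offered as a check on the statements above, not as a reconstruction of any machine's circuitry; the function names are invented for the illustration.

```python
def excess3(digit):
    """Four-place binary pattern for a decimal digit in excess-3 code."""
    if not 0 <= digit <= 9:
        raise ValueError("digit must be 0 through 9")
    return format(digit + 3, "04b")


def flip(bits):
    """Interchange 0's and 1's."""
    return "".join("1" if b == "0" else "0" for b in bits)


for d in range(10):
    # Self-complementing: flipping every bit gives the code for 9 - d.
    assert flip(excess3(d)) == excess3(9 - d)
    # Positive indication of zero: no digit is coded as 0000.
    assert "1" in excess3(d)

# Carry-over from the four binary digits coincides with decimal carry.
for a in range(10):
    for b in range(10):
        binary_carry = (a + 3) + (b + 3) >= 16
        assert binary_carry == (a + b >= 10)

print(excess3(0), excess3(9))
```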

The 8-4-2-1 code has also been recommended as 
a means for increasing the capacity of the columns on 
machine sorted cards; Royer (82) suggests the division 
of the 12 positions in each column into three groups of 
4 each, assigning the 8-4-2-1 code to each group. 
Eckert (83) records the use of the 8-4-2-1 code in the 
IBM Selective Sequence Electronic Calculator. 

Codes using more than four positions for the rep- 
resentation of a single decimal digit have been devel- 
oped to simplify certain special operations involving 
marginal punched cards (84) and computing machinery 
(85). A common scheme which facilitates the selection
of individual marginal punched cards without sacrificing
the easy ordering feature is the SF-7-4-2-1-0 system 
(86). The "SF" position indicates that the other punch 
in the field stands for a single figure; this position is 
not punched when the decimal digit represented is 3, 5, 
6, 8, or 9 (87). The use of the "0" in the code becomes
essential only when more than one field is employed;
its function is to separate, for example, the
70's from the 71's, 72's, 73's, etc. (88). The essential
characteristic of a selector code, however, is that
the same number of yes-designations be placed in the 
field for each digit represented; and since five positions 
taken two at a time yield exactly ten distinct meanings, 
the 5-bit selector code is widely used (89, 90). In the 
case of the marginal punched card, a 5-position selector 
code provides for the isolation of an individual digit by 
two simultaneous needlings (91). Of course many other 
selector codes are possible; a table of the various com- 
binations and their capacities is included by Wise (92). 
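
The selector principle may be sketched in Python as follows. The assignment of punches in `SF_CODE` is one reading of the SF-7-4-2-1-0 field described above and should be treated as an assumption (in particular the coding of zero as "SF" plus "0"); the card labels are invented for the illustration.

```python
from itertools import combinations

# Five positions taken two at a time give exactly ten patterns.
FIVE = ("7", "4", "2", "1", "0")
assert len(list(combinations(FIVE, 2))) == 10

# One reading of the SF-7-4-2-1-0 field described above (an assumption):
# digits that need only one weight also receive the "SF" punch, so every
# digit carries exactly two punches.
SF_CODE = {
    0: {"SF", "0"}, 1: {"SF", "1"}, 2: {"SF", "2"}, 3: {"2", "1"},
    4: {"SF", "4"}, 5: {"4", "1"}, 6: {"4", "2"}, 7: {"SF", "7"},
    8: {"7", "1"}, 9: {"7", "2"},
}
assert all(len(p) == 2 for p in SF_CODE.values())


def select(cards, digit):
    """Two simultaneous needlings: only cards notched at both holes drop."""
    first, second = SF_CODE[digit]
    return [label for label, punched in cards
            if first in punched and second in punched]


cards = [(f"card-{i}", SF_CODE[d]) for i, d in enumerate((3, 8, 5, 3, 0, 7))]
print(select(cards, 3))
```

Because every card carries the same number of punches, a card that drops from both needles cannot carry any other digit; with the plain 7-4-2-1 code, by contrast, needling "2" and "1" would also release cards coded for 10 or 14.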

Selector codes on marginal punched cards are 
sometimes equipped for direct reading by the use of a 
triangular arrangement; the basic principle of the code, 
however, remains the same (93). For a diagram of
triangular selector coding, see Figure 6. The meaning of
any two punches may be determined by reading the number
printed in the triangle at the intersection of the
diagonal columns marked off by the punches.

Figure 6
5-Position Triangular Code for a Marginal Punch Card

Five-bit selector codes also offer advantages for 
use in electronic calculators: "With 5 bits there are 
ten different combinations with two 1's, and when the
ten decimal digits are represented in this manner, it is
possible to distinguish any digit without the use of
inverters, which is an advantage" (94). The IBM Magnetic
Drum Calculator type 650 employs, in one part of its
operation, a selector code of the form 6-3-2-1-0 (95), 
although developers of one of the Harvard calculators 
have found a 5-bit weighted 8-6-4-2-1 code to be use- 
ful (96). The Rapid Selector, which records its code 
by means of opaque and transparent spots on film, em- 
ploys a 5-position selector code with an overlapping ar- 
rangement which will record decimal numbers to a mag- 
nitude of seven places (97). A diagram of the code is 
shown in Figure 7.

Figure 7
Arrangement of the 7-field 5-position Rapid Selector
Code. Heavy lines indicate the division of the fields;
numbers in parentheses show the location of the positions
in the two types of fields. The synchronization mark
positions the rows properly for scanning.

The use of codes of five or more bits to represent 
decimal digits for purposes of machine calculations has 
been suggested for two reasons: "[the use of them may 
make it] possible to effect simplifications in the arith- 
metic circuits in some cases . . . [and they can provide] 
the ability to detect errors" (98, 99). Krider reports
the use of a 6-position code for automatic programming
of certain small calculators (100), and Richards goes into
some detail about error-detecting and error-correcting
codes which may use seven or eight bits to represent
the ten decimal digits (101). Such error-detecting
and correcting codes are subject to the following
conditions (102): a single-error-detecting code requires
the use of five bits; single-error-correcting, seven bits;
double-error-detecting, eight bits. Another way of
stating the conditions is that the detecting and correcting
powers increase with the number of changes required
to transform the representation of one digit into
the representation of another; if only one change is
required, then no error-detecting or error-correcting
properties are available. A code having the desired
detecting and correcting qualities may be constructed
through the use of "redundancy bits": extra positions
used to indicate whether the quantity of 1's in a
certain section of the coded digit is even or odd; if
there are an even number of 1's, then the redundancy
bit is a 1, but if there are an odd number of 1's, the
redundancy bit is a 0 (103).
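
The relation between "number of changes" and detecting power can be made concrete with a short Python sketch. It is an illustration under stated assumptions: the 8-4-2-1 representation is used as the starting code, a single redundancy bit covering all four positions is appended according to the rule just quoted, and the function names are invented here.

```python
from itertools import combinations


def with_redundancy_bit(bits):
    """Append a bit following the rule quoted above: 1 if the count of
    1's is even, 0 if it is odd (so every coded digit has an odd total)."""
    ones = bits.count("1")
    return bits + ("1" if ones % 2 == 0 else "0")


def min_changes(codewords):
    """Smallest number of positions in which any two codewords differ."""
    return min(sum(a != b for a, b in zip(x, y))
               for x, y in combinations(codewords, 2))


plain = [format(d, "04b") for d in range(10)]       # 8-4-2-1 digits
checked = [with_redundancy_bit(c) for c in plain]   # five bits per digit

print(min_changes(plain))    # fewest changes turning one digit into another
print(min_changes(checked))  # the same measure once the extra bit is added


def looks_valid(word):
    """A single changed bit makes the count of 1's even, exposing it."""
    return word.count("1") % 2 == 1
```

The five-bit result agrees with the condition stated above for single-error detection; error correction would require a still larger minimum number of changes and hence more positions.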

Up to this point we have been discussing the var- 
ious means by which decimal digits (0 through 9) may 
be converted into code patterns using only yes-no
designations. The fact that "almost all high-speed computers
are based on two-state storage systems" (104) has
suggested that "A simpler method of using two-state
elements is to work in the scale of two, or the 'binary'
scale, as it is often called" (105). Livesley gives the
following comparative table of numerical values in the 
decimal and binary forms: 

Decimal Form of Number                   Binary Form

6    (= 6x10^0)                          110        (= 1x2^2 + 1x2^1)
18   (= 1x10^1 + 8x10^0)                 10010      (= 1x2^4 + 1x2^1)
273  (= 2x10^2 + 7x10^1 + 3x10^0)        100010001  (= 1x2^8 + 1x2^4 + 1x2^0)
3.25 (= 3x10^0 + 2x10^-1 + 5x10^-2)      11.01      (= 1x2^1 + 1x2^0 + 1x2^-2)

Richards points out that the use of both binary and
decimal systems for computing machinery is very com- 
mon: 

So far as is known, radices two, three, eight, 
twelve, and sixteen are the only ones which 
have ever received serious consideration for 
use in computing machinery. The list of those 
which have actually been used is much more 
restricted. In fact, no computer is known in 
which a radix other than two or ten is employed. 
One minor exception to this last statement ex- 
ists in that at least two companies have built 
small electromechanical desk computers in 
which radix eight is used (107). 

He points out, however, that "printed numbers in
the binary system are undesirable because it is difficult
to handle a large number of nothing but 0's and 1's
without making excessive errors" (108).

Upon occasion, a binary system has been recom- 
mended for use with machine sorted cards; thus, the 
twelve positions in each column could be used in binary 
fashion to represent decimal numbers from 1 to 4,095
(109). This procedure, however, reintroduces the prob- 
lems of multiple punches per column which were dis- 
cussed earlier. The use of straight binary, and even 
ternary, coding on marginal punched cards has been 
suggested by Giffler (110), but there is no evidence that 
such schemes have been put into common practice. 

There are two other radix systems which have 
certain features to commend them. The octonary sys- 
tem has been suggested for the following reasons: 

The expression of a number in binary notation 
requires a comparatively large number of digits.
. . . For example, a number with six decimal
digits may require 20 binary digits; the
same number, in octal notation, may need only
seven. The number of octal digits is thus only
slightly greater than the number of decimal digits,
the conversion from binary representation
is trivially easy [see table below], and an
inexperienced reader can get good qualitative
estimates of the results without worrying about
the octal character of the representation (111). 

Binary Representation of Octal Digits

Binary triplet    Octal mark

000               0
001               1
010               2
011               3
100               4
101               5
110               6
111               7
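
The ease of the conversion can be shown by a brief Python sketch that reads a binary number three positions at a time, as in the table. The function name is invented for the illustration, and 999,999 is chosen simply as the largest number of six decimal digits.

```python
def binary_to_octal(bits):
    """Convert a binary string to octal by reading triplets from the right."""
    padded = bits.zfill(-(-len(bits) // 3) * 3)   # pad on the left to a multiple of 3
    triplets = [padded[i:i + 3] for i in range(0, len(padded), 3)]
    return "".join(str(int(t, 2)) for t in triplets)


n = 999_999                      # the largest six-digit decimal number
bits = format(n, "b")
octal = binary_to_octal(bits)

assert octal == format(n, "o")   # agrees with Python's own conversion
print(len(str(n)), len(bits), len(octal))   # decimal, binary, octal digit counts

# Largest value expressible by twelve yes-no positions in straight binary:
print(2 ** 12 - 1)
```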

A system of mixed radices, the biquinary system,
has been employed by the Bell Telephone Laboratories
in some of their computers (see reference no. 80, p. 76). 
Richards asserts: "The biquinary code has 7 bits with 
the weights of 5, 0, 4, 3, 2, 1, 0. With this code, 
arithmetic operations may be performed in a moderately 
straightforward manner, although whether or not there 
is a net simplification when compared with the 4-bit 
codes is a debatable point. The main reason for the 
use of 7 bits is the ability to detect errors" (112). 
Hamilton records the use of this code in one portion of 
the IBM Magnetic Drum Calculator (type 650) (113). 
Richards also mentions a quibinary code with weights 
8, 6, 4, 2, 0, 1, 0, but he does not cite any computer 
in which this code has been used (114). 
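
The error-detecting property of the biquinary code follows from its structure: one and only one position is marked in the two-position group, and one and only one in the five-position group. The Python sketch below assumes the weight order 5, 0, 4, 3, 2, 1, 0 given in the quotation; the function names are invented, and no claim is made that any machine performed the check in this way.

```python
BI_WEIGHTS = (5, 0)
QUI_WEIGHTS = (4, 3, 2, 1, 0)


def biquinary(digit):
    """Seven-bit pattern in the weight order 5, 0, 4, 3, 2, 1, 0."""
    if not 0 <= digit <= 9:
        raise ValueError("digit must be 0 through 9")
    high, low = divmod(digit, 5)          # high is 0 or 1, low is 0 to 4
    bi = "".join("1" if w == 5 * high else "0" for w in BI_WEIGHTS)
    qui = "".join("1" if w == low else "0" for w in QUI_WEIGHTS)
    return bi + qui


def decode(bits):
    """Return the digit, or None if the pattern is not a valid code."""
    bi, qui = bits[:2], bits[2:]
    if len(bits) != 7 or bi.count("1") != 1 or qui.count("1") != 1:
        return None                       # an error has been detected
    return BI_WEIGHTS[bi.index("1")] + QUI_WEIGHTS[qui.index("1")]


for d in range(10):
    assert decode(biquinary(d)) == d

print(biquinary(7))            # one "1" in each group
print(decode("1100100"))       # two 1's in the first group
```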

The desire to gain coding capacity on marginal 
punched cards has led to experiments with systems us- 
ing more than one row of holes (115). McGaw reports 
the use of a 4-position, 2-row field which provides for 
"the selective sorting of nine classifications" (116). 
Ruston discusses the same system in terms of triangular 
coding, with each intersection representing two digits: 
if the upper digit is to be indicated, the left hand col- 
umn is deep-punched and the right hand column is shal- 
low-punched, while the reverse arrangement indicates 
the lower digit (see Figure 8) (117). Draheim goes into
great detail about the various coding possibilities of ten
positions in double and triple row combinations (118); 
however, the use of double and triple rows has not been 
standardized to any great degree as yet (119). One 
fairly common use seems to be the assignment of the 
deep punches to a broad subject and of the shallow 
punches to subheadings (120). The 3-row system may 
also employ a third type of punch (much like the slotting
process described earlier) to increase the capacity
and selectivity features of a marginal punched card (121), 
although the sorting procedure may be considerably
complicated thereby (122).

Figure 8
Triangular 2-row, 4-position Coding
on a Marginal Punch Card



Coding the Alphabet 

The representations of alphabetic material in yes- 
no form have been designed with two main objects in 
mind: the coding of individual letters as separate enti- 
ties and the coding of letters so as to form words (123, 
124). Although marginal punched cards have been designed
which simply allot 26 spaces for the letters of
the alphabet (125), other more economical systems have
been developed. Among these is one which has re- 
ceived considerable attention: the OLECB plan devel- 
oped by Gerald Cox (126, 127, 128). This code has been 
issued in two versions, the revised plan appearing
below (129, 130):

Letter  Code        Letter  Code     Letter  Code

A       No punch    M*      IE       U       OI
B       B           MAC     IEB      V       OIB
C       C           M**     IEC      W       OIC
D       CB          N       IECB     X       OICB
E       E           O       O        Y       OIE
F       EB          P       OB       Z       OIEB
G       EC          Q       OC
H       ECB         R       OCB
I       I           S*      OE
J       IB          SCH     OEB
K       IC          S**     OEC
L       ICB         T       OECB

*Used for the initial letter of words which
alphabetically precede the next group of letters in
the system.

**Used for the initial letter of words which
alphabetically follow the preceding group of letters
in the system.

The extra letter combinations have been included 
so as to facilitate the alphabetical arrangement of index 
words, and sequence sorting proceeds along the same 
lines as that for the 7-4-2-1 numerical code (131). 
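
The table can be read as a binary numbering of the thirty index entries, with the holes B, C, E, I, and O carrying the weights 1, 2, 4, 8, and 16. That reading is an inference from the pattern of the table, not something the sources cited here state, and the function name below is illustrative only. A minimal sketch under that assumption:

```python
# Hole weights inferred from the table (an assumption): B=1, C=2, E=4, I=8, O=16.
WEIGHTS = {"B": 1, "C": 2, "E": 4, "I": 8, "O": 16}

# The thirty entries in table order. "M*", "MAC", "M**", "S*", "SCH", and "S**"
# are the extra combinations included to aid alphabetical arrangement.
ENTRIES = [
    "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L",
    "M*", "MAC", "M**", "N", "O", "P", "Q", "R", "S*", "SCH", "S**", "T",
    "U", "V", "W", "X", "Y", "Z",
]


def oiecb_code(entry: str) -> str:
    """Return the holes punched for an entry, written in O-I-E-C-B order."""
    number = ENTRIES.index(entry)  # A is 0, so it receives no punch
    holes = [hole for hole in "OIECB" if number & WEIGHTS[hole]]
    return "".join(holes) or "No punch"


for entry in ("A", "K", "SCH", "Z"):
    print(entry, oiecb_code(entry))
```

Because the entries are numbered in alphabetical order, sorting the cards on the five holes in turn would restore that order, which is consistent with the remark that sequence sorting proceeds as for a numerical code.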

Another means of alphabetical arrangement actually 
uses the 7-4-2-1 system, plus an additional position 
marked "M" (132); the code is shown below: 

Letter          Code 

A   M*          1 
B   N           2 
C   O           1 & 2 
D   P           4 
E   Q           4 & 1 
F   R           4 & 2 
G   S           4 & 2 & 1 
H   T           7 
I   U           7 & 1 
J   V           7 & 2 
K   W           7 & 4 
L   X, Y, Z     7 & 2 & 1 

*All letters in the second column also receive a punch 
in a position marked "M". 

A third arrangement, which does not require the 
grouping of X, Y, Z, is the NZ-7-4-2-1 code; this code 
makes use of all combinations of 7-4-2-1 which total 
numbers 1 through 13, and these combinations are in 
turn allotted to letters A through M and to N through 
Z (with the additional NZ punch being used for the second 
group) (133). The Dysonian system for representing 
letters by combinations of 1, 2, 4, 7, 10, 20, 40, 70, 
and a position marked L, on a single IBM card column, 
is very similar to the NZ-7-4-2-1 system, except that 
Dyson assigns 26 rather than 13 numbers to the letters 
and uses the "L" punch to distinguish letters from numbers 
(134). 
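
The NZ-7-4-2-1 arrangement can be sketched as follows. The source does not say which combination is used when a total can be formed in more than one way (7 alone, or 4 & 2 & 1), so the sketch assumes the combination with the fewest punches; that choice and the function names are assumptions made for illustration.

```python
from itertools import combinations

WEIGHTS = (7, 4, 2, 1)


def punches_for(total: int) -> tuple:
    """Fewest-punch subset of 7-4-2-1 that sums to `total` (tie-break assumed)."""
    for size in range(1, len(WEIGHTS) + 1):
        for combo in combinations(WEIGHTS, size):
            if sum(combo) == total:
                return combo
    raise ValueError(f"{total} cannot be formed from 7-4-2-1")


def nz_code(letter: str) -> tuple:
    """Holes for a letter: A-M use totals 1-13, N-Z add the NZ punch."""
    index = ord(letter.upper()) - ord("A")  # A=0 ... Z=25
    if not 0 <= index <= 25:
        raise ValueError("expected a single letter A-Z")
    holes = tuple(str(w) for w in punches_for(index % 13 + 1))
    return ("NZ",) + holes if index >= 13 else holes


for letter in "AJMNZ":
    print(letter, " & ".join(nz_code(letter)))
```

With thirteen distinct totals and one extra position, all 26 letters receive their own pattern, so no grouping of X, Y, and Z is needed.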

A system which does not allow for sequence sorting 
is the 15-hole code reported by Anderson (135); this 
code allots one hole to each vowel and one hole to every 
two consonants, with X, Y, Z again being grouped together. 
Westbrook and DeWald also suggest the assignment 
of two letters per hole, for the purpose of coding 
the initial letters of authors' names (136). 

A 9-position alphabetic selector code, made up of 
letters A, B, D, F, G, K, P, S, and V, has been reported 
by Wise as a system for representing the initial 
letter of an author's name. The code is as follows 
(137): 

Letter  Holes Punched    Letter  Holes Punched 

A       ASF              N       KD 
B       BSF              O       KG 
C       BA               P       PSF 
D       DSF              Q       PA 
E       DA               R       PB 
F       DB               S       PD 
G       GSF              T       PG 
H       GA               U       PK 
I       GB               V       VSF 
J       GD               W       VA 
K       KSF              X       VB 
L       KA               Y       VD 
M       KB               Z       VG 

The use of double and triple rows on marginal 
punched cards for alphabetical coding has been suggested 
or reported by certain writers. Draheim describes 
a triple-row, 9-position arrangement, incorporating 
"sch" as his twenty-seventh letter (138). Campbell suggests 
a double-row, 3-position code for representing 
the first letters of authors' last names, observing that 
in his field of general organic research a sorting of the 
cards by author is seldom done, and thus some of the 
less commonly used letters may be combined into a 
single representation (139). 

The triangular alphabetic code is discussed by 
Huston (140), Hood (141), and Cox (142). The princi- 
ples of this type of code are exactly the same as those 
for numerical triangular coding; a 2-row, 6-position 
version of the code is shown in Figure 9 (143). 
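
Note 143 gives the capacity of a triangular code as the number of intersections (combinations of the positions taken two at a time) multiplied by the capacity of each intersection (permutations of the rows taken two at a time). A short sketch of that arithmetic, with a function name chosen here for illustration:

```python
from math import comb, perm


def triangular_capacity(positions: int, rows: int) -> int:
    """Capacity of a triangular code as described in note 143."""
    intersections = comb(positions, 2)   # pairs of positions
    per_intersection = perm(rows, 2)     # ordered pairs of rows
    return intersections * per_intersection


# The 2-row, 6-position case worked in note 143: 2 x 15 = 30 items.
print(triangular_capacity(positions=6, rows=2))
```

Thirty items is enough for the 26 letters with a few combinations to spare.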

Alphabetic representation on machine sorted cards 
has been, to a large degree, standardized, although 
each manufacturer presents a slightly different version 
(144). The form taken by these codes is indicated in 
Figures 2, 3, and 4. 

IBM and Samas use a constant number of punches 
(two each) for the letters, while Rand uses both two 
and three punch combinations (145). Cochran (146), 
Gull (147), and Nalbandjan (148) all mention the use of 
the IBM alphabetic system; but Moffit is concerned about 
the amount of space required to code words in this 
manner (149), since only one letter can be coded into 
and interpreted from each column on the machine sorted 
card. 

Figure 9 

Triangular 2-row, 6-position Alphabetic Code on 
a Marginal Punch Card (Subscripts indicate the 
division of a letter into two coding parts to 
facilitate an alphabetic ordering.) 



A different code system designed for use with machine 
sorted cards was suggested by Samain (150), who 
zoned the card into 6-column groups (omitting the first 
column). Each group thus contained twelve rows of six 
positions each. By allowing three perforations to each 
6-position row, he could code any one of 20 different 
combinations into a single row in one group. Assuming 
that the alphabet could be reduced to 20 letters, the capacity 
of an 80-column, 12-row card would become 24 
6-letter ideas. 
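
The figure of 20 combinations is the number of ways of choosing three positions out of six. The following sketch simply enumerates them; it does not attempt to reproduce Samain's actual assignment of letters to patterns, which is not given here.

```python
from itertools import combinations

POSITIONS = range(1, 7)  # the six positions of one row within a group

# Every way of placing exactly three perforations in a 6-position row.
patterns = list(combinations(POSITIONS, 3))

print(len(patterns))   # number of distinct three-hole patterns
print(patterns[:3])    # the first few patterns, as position numbers
```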

The development of telegraphy has occasioned another 
type of alphabetic code, commonly known as 
Morse code (151). Although the code is essentially 
binary, the so-called "dash" being merely an extended 
"dot," the signal duration can be varied in different 
ways, making the system ternary, quinary, or what have 
you, in its behavior (152). The use of time as a variable 
in the representation of decimal digits by means 
of binary equipment is also noted by Richards in connection 
with computing machinery (153). The American 
and International versions of the Morse code are shown 
below, together with the 2-channel tape version used for 
cable transmission. 

[Table: Morse code by letter in its American, International (land lines), and cable (2-channel tape) versions] 



The 5-channel punched tape code used by the teletype 
is sometimes called the Baudot code and is as follows: 

[Table: 5-channel punched tape code for the letters A through Z, channels 1 through 5, with a line indicating the position of the feed holes] 



Some of the common language machines have been 
designed to operate from 5-channel tape because of the 
equipment already in production for translating alphabetical 
and numerical data into that form, even though 6-, 
7-, and 8-channel tapes are also available (154). 

An alphabetic code based on the binary system has 
been adopted as a standard means of producing books 
for the blind (155). The Braille alphabet is composed 
of groups of six embossed dots arranged in two columns 
of three dots each (see Figure 10). In advanced Braille, 
the dots may be used to indicate combinations of letters 
rather than individual ones, but the system is essential- 
ly the same (156). The number of combinations avail- 
able in this system is 63 (157,158). 
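
The count of 63 follows from treating each of the six dot positions as either embossed or not and setting aside the blank cell, as note 158 explains. A brief sketch of that count (the variable names are illustrative):

```python
from itertools import product

# Each cell is a tuple of six 0/1 values, one per dot position.
all_cells = list(product((0, 1), repeat=6))

# Exclude the arrangement with no dots, which is not counted as a character.
usable_cells = [cell for cell in all_cells if any(cell)]

print(len(all_cells), len(usable_cells))
```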

Because of the time required by the operation of 
converting written or typed data into a form suitable for 
computer input, means have been developed to by-pass 
this extra step under certain limited conditions. A photoelectric 
scanning device is now capable of transforming 
printed characters directly into binary form since 
each character is made up of a distinctive pattern of 
horizontal and vertical marks. Whenever the scanner 
detects a black mark on the white background, it produces 
an electrical impulse; and the complete impulse 
pattern becomes the binary representation of the character 
(159, 160). The reverse of this procedure may 
also be accomplished: binary machine code may be reconverted 
into alphabetic symbols through the use of a 
matrix of wires. Bursts of current are sent through 
the matrix at various points which in turn deposit charges 
on a paper in contact with the matrix; ink is attracted 
to the charged points on the paper, producing the 
form of the alphabetic character (161). Another system 
converts electrical impulses into readable characters 
on a telescreen (162). 

Figure 10 

Braille Alphabet 



Superimposed Coding 

The numeric and alphabetic codes discussed thus 
far do not, in some situations, produce enough possible 
combinations to cover all of the categories which may 
need to be coded. To solve this problem, a type of 
coding called "superimposed" has been developed (163). 
Two major forms of such coding have been suggested: 
an alphabetic system called "word coding" (164) and a 
random number system called "Zatocoding" (165). The 
basic principle of both systems is, however, the same, 
for they each use codes made up of a given number 
(or numbers) of characters and punch (or otherwise 
record) the characters representing several different 
items into the same field. When this is done, overpunching 
may occur since the codes for any two items 
can have certain characters in common. Once a character 
(i.e., a position) is punched, a second punch cannot 
be detected; hence, the total number of punches in 
the field may be less than the sum of the characters 
used to represent the items (166, 167). 

The overlapping of codes can result in an inability 
to make an absolute selection of cards which deal only 
with the item wanted; in other words, the patterns of 
several items may overlap enough to produce the pattern 
of another item with which the card has no connection. 
When the cards are sorted for this latter item, so-called 
"extra cards" may appear (168). Because an excessive 
number of unwanted cards would impair the value 
of the system, both Wise and Mooers have worked 
out mathematical theories which are intended to keep 
the unwanted cards to a minimum (169, 170). For various 
discussions of the mathematical theory involved in 
these systems, the reader is referred to the following 
sources: 50, 163, 169, 171. 
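
Both effects, overpunching and the appearance of extra cards, can be seen in a small sketch. The item names, the 3-punch codes, and the function names are hypothetical and chosen only so that the overlap is visible; they are not taken from Wise or Mooers.

```python
def superimpose(codes):
    """Punch several item codes into one field (the union of their positions)."""
    field = set()
    for code in codes:
        field |= set(code)
    return field


def selects(field, code):
    """A card drops out in sorting when every position of the wanted code is punched."""
    return set(code) <= field


# Hypothetical 3-punch codes (illustrative only).
CODES = {
    "acids": (1, 4, 9),
    "esters": (2, 4, 7),
    "salts": (1, 2, 7),   # never recorded on the card below
    "bases": (3, 5, 8),   # never recorded on the card below
}

card = superimpose([CODES["acids"], CODES["esters"]])

print(sorted(card))                    # five positions, not six: position 4 is overpunched
print(selects(card, CODES["acids"]))   # a wanted card
print(selects(card, CODES["salts"]))   # an "extra card": pattern formed by overlap
print(selects(card, CODES["bases"]))   # correctly rejected
```

The card for "acids" and "esters" also answers to "salts" because the three positions of that code happen to be covered by the other two patterns, which is the situation the mathematical theories are meant to keep rare.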

The conclusions reached by Mooers and by Wise 
may be summarized as follows: 

Mooers--"The sum of the separate punches of the individual 
code patterns placed in the coding field of a card 
shall not exceed 69% of the total number of positions in 
the field, in which case the average density of punched 
positions in the field will not exceed 50%. This is the 
condition indicating optimum utilization. . . . With a selection 
according to S positions, the ratio of the number 
of extra cards to the total number of cards sorted 
will be less than (1/2)^S on the average" (172). 

Wise (with reference to word coding)--"We may say that 
for practical punching schemes involving a maximum of 
entries punched on the card the optimum conditions are 
attained when the number of instructions to punch in a 
given field amounts to approximately 46 per cent of the 
available punching positions. Because of overlapping, 
such a proportion of punching instructions will result in 
approximately 37 per cent (1/e) of the positions being 
actually punched" (173). 

Wise (with reference to random number codes, where X is 
the total number of holes punched and H is the total 
number of holes available on the medium)--"It should be 
noted that the minimum optimum value of X/H is 
0.500000 or (1/2) instead of 0.367879 (1/e), . . . Both 
systems of punching have the value of X/H approaching 
a maximum optimum of 0.693147 as the number of subjects 
increases" (174). 
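
The figures quoted above are related in a simple way if one assumes that punches fall at random and that the field is large, so that a load of x punching instructions per position leaves a fraction 1 - exp(-x) of the positions actually punched. That model is an assumption introduced here for illustration; the quotations do not state it in this form.

```python
import math


def punched_fraction(load: float) -> float:
    """Expected fraction of positions punched for a given load.

    `load` is (punching instructions) / (available positions). The formula
    1 - exp(-load) assumes random placement in a large field.
    """
    return 1.0 - math.exp(-load)


print(round(math.log(2), 6))                      # 0.693147, the "69%" figure
print(round(1 / math.e, 6))                       # 0.367879, Wise's 1/e
print(round(punched_fraction(math.log(2)), 3))    # 0.5, Mooers' 50% density
print(round(punched_fraction(0.46), 3))           # 0.369, close to Wise's 37 per cent
```

Under this reading, Mooers' 69% load and 50% density are the same condition stated twice, and Wise's 46 per cent of instructions corresponds to roughly 1/e of the positions punched.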

The 69% appears as a common value determined 
by both Wise and Mooers. A further selection from 
Mooers' writings will serve to show a practical application 
of his conclusions: 

According to the method of Zatocoding we will 
first compute the vocabulary V, and then the number of 
combinations T. In a field of F positions, in which N 
punches per code pattern are placed without restriction 
upon their position, the number of different patterns, 
and thus the possible coding vocabulary, is C(F, N), or the 
number of combinations of F things taken N at a time. 
Thus the coding vocabulary V in a 40 position field with 
a four-punch code is 91,390 different codes. Other code 
lengths may of course be used. The capacity of this 
field of 40 positions is obtained by finding 69% of 40, 
which is 27.6, or nearly 28. Therefore, the field can 
contain seven four-punch codes (because four times seven 
equals twenty-eight). By the Zatocoding statistics 
we know that on the average 20 positions on the field 
will be punched, due to expected overlapping, when 
these 28 code punches are placed on the field. Seven 
codes out of a vocabulary of 91,390 different ones may 
be placed in the field. Then the number of combinations 
T possible on the field of a card is the number of 
combinations of 91,390 things taken seven at a time, or 
something in the order of 10^31 (175). 
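
The arithmetic of this passage can be checked directly. In the sketch below the expected number of punched positions is computed by treating the 28 punches as independent random placements, which is a simplifying assumption made here and not necessarily the exact statistics Mooers used.

```python
from math import comb, log10

F, N = 40, 4                       # field positions, punches per code

V = comb(F, N)                     # coding vocabulary: 40 things taken 4 at a time
capacity = 0.69 * F                # 69% of the field, i.e. about 27.6 punches
codes_per_card = round(capacity) // N   # 28 punches allow seven 4-punch codes

T = comb(V, codes_per_card)        # combinations of V codes taken seven at a time

# Expected distinct positions punched by 28 random punches (model assumed).
total_punches = codes_per_card * N
expected_punched = F * (1 - (1 - 1 / F) ** total_punches)

print(V)                           # 91390
print(codes_per_card)              # 7
print(int(log10(T)))               # 31, i.e. T is of the order of 10^31
print(round(expected_punched))     # 20
```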

A certain amount of disputation has developed between 
Mooers and Wise upon the relative merits of 
their respective systems (176, 177, 178). Mooers rejects 
the alphabetical system by Wise because it lacks randomness 
(179) and further charges that Wise's application 
of superimposed coding to the Rapid Selector is 
mathematically in error (180, 181). In a later comment, 
Wise suggests that the alphabet might be given a random 
distribution by increasing it to 30 positions and applying 
the author analysis used by Cox, Casey, and 
Bailey in setting up their alphabetical code (182). With 
regard to the Rapid Selector, Wise would use a 6-letter 
code and up to 16 subjects per item, with a search directed 
to a single code yielding extra cards to a ratio 
of less than 1 to 100; a search on two codes, less than 
1 to 10,000; a search on three codes, less than 1 to 
1,000,000 (183). Under Mooers' calculations, the most 
efficient code would use a random pattern of 8 positions, 
with 18 patterns (or subjects) allowable per field; a selection 
on the basis of one pattern would yield extra 
cards in a ratio of less than 1 to 500; on two patterns, 
less than 1 to 50,000; on three patterns, less than 1 to 
5,000,000 (184). 

Two processes for selecting random numbers suitable 
for use in superimposed codes have been described 
(185, 186), one of which makes use of the Rand Corporation's 
publication of 1,000,000 random digits (187). The 
Brown and Oneal paper applies the system to IBM cards, 
rather than to the marginal punched cards used by 
Mooers (189) and Wise (190). 




Both Mooers (191) and Wise (192) claim that su- 
perimposed systems are highly successful when con- 
structed according to the mathematical specifications. 
Several installations based on the principles set forth 
by Mooers, some of which also use IBM rather than 
marginal punched cards, are reportedly functioning 
properly and satisfactorily (193, 194, 195). Support for 
the word coding system is furnished by articles from 
Sebring and from Perry (196, 197). 



References 

1. McGaw, Howard F. Marginal punched cards in col- 

lege and research libraries; Washington, D. C. , 
Scarecrow press, 1952, p. 44. 

2. Reference 1, p. 45. 

3. Breger, Irving A. Applications of simple coding 

procedures to a specific problem. In: Casey, 
Robert S. and James W. Perry, Punched cards; 
New York, Reinhold, 1951, p. 30-35. 

4. Reference 1, p. 44. 

5. Casey, Robert S., C. F. Bailey, and Gerald J. 

Cox. Punch card techniques and applications. 
Journal of chemical education, 23:495-99 (1946). 

6. Reference 3, p. 34. 

7. Clarke, S. H. A multiple- entry perforated- card 

key with special reference to the identification of 
hardwoods. New phytologist, 37:369-74 (1938). 

8. Dadswell, H. E. and Audrey M. Eckersley. The 

card sorting method applied to the identification 
of the commercial timbers of the genus Eucalyp- 
tus. Council for scientific and industrial re- 
search (Australia), Journal, 14 (no.4):266-80 
(1941). 

9. Dunkley, H. L. A multiple- entry perforated card- 

key for the identification of Uganda trees. Em- 
pire forestry journal, 18:83-90 (1939). 

10. Epstein, Albert. Statistical analysis with hand 

punched and sorted cards. American statistic- 
ian, 1 (no.2):6-7 (Oct. 1947). 

11. Hurlbut, C. S. , Jr. Mineral determination by 

means of punched cards. American mineralogist, 
33:508-12 (1948). 

12. Making alloy information accessible. Metal industry 
(London), 54:617-8 (Jan.-Je. 1939). 

13. Normand, D. Les clés pour l'identification des 

bois et le système des fiches perforées. L'agronomie 
tropicale, 1:162-72 (1946). 


14. Orton, Floyd E. Let's look at paper work. Library 

journal, 75:368, 384-87 (1950). 

15. Diakonoff, A. Het paramount-kaartsysteem ten 

dienste van de entomologie. Entomologische 
mededeelingen van Nederlandsch-Indië, 7 (no. 2): 
34-40 (1941). 

16. Reference 1, p. 15. 

17. Casey, Robert S. and James W. Perry. Introduc- 

tion, in their Punched cards; New York, Rein- 
hold, 1951, p. 5. 

18. Casey, Robert S. and James W. Perry. Element- 

ary manipulations of hand- sorted punched cards, 
in their Punched cards; New York, Reinhold, 
1951, p. 10. 

19. Black-Schaffer, Bernard and Paul D. Rosahn. Meth- 

ods of analysis of Yale autopsy protocols, includ- 
ing a code for the punched card study of syphilis. 
Yale journal of biology and medicine, 15:575-86 
(1942/43). 

20. Eckert, Wallace John. Punched card methods in 

scientific computation; New York, Thomas J. 
Watson astronomical computing bureau, Columbia 
University, 1940. 

21. Reference 17, p. 5. 

22. Lawson, Murray G. The machine age in historical 

research. American archivist, 11:141-49 (1948). 

23. Friedman, Burton Dean. Punched card primer; 

Public administration service, 1955, p. 8-13. 

24. Powers-Samas cards are also available, according 

to Friedman (see reference 23) in the 21 column 
size, while Callander (Punched card systems; 
their application to library technique. Library 
association record, 48:171-4 (1946)) describes 
a card, issued by the same company, which has 
36 columns of 10 positions each. 

25. Gage, N. L. and H. M. Remmers. Opinion polling 

with mark-sensed punch cards. Journal of ap- 
plied psychology, 32:88-91 (1948). 

26. Taylor, F. Lowell. Numerical index key for the 

Beilstein system. Industrial and engineering 
chemistry, 40:470-73 (1948). 

27. Edwards, Thomas I. The coding and tabulation of 

medical and research data for statistical analysis. 
Public health report (U.S. Public Health Service), 
57 (pt. 1):7-20 (Jan.-Je. 1942). 

28. Kempthorne, O. The use of a punched- card sys- 

tem for the analysis of survey data, with spec- 
ial reference to the analysis of the National farm 
survey. Royal Statistical Society, Journal, 109: 
284-95 (1946). 

29. Peakes, Gilbert L. Report indexing by punched 

cards. Journal of chemical education, 26:139-46 
(1949). 

30. Reference 19. 

31. Roches, Camille. La mecanographie au service de 

la documentation. L'industrie textile, 63:107-8 
(1946). 

32. Appel, Valentine and George Cooper. A refine- 

ment in the use of mark- sense cards for test 
research. American statistical association, 
Journal, 50:557-60 (Je. 1955). 

33. Peakes, Gilbert L. Report indexing by machine-sorted 

punched cards. In: Casey, Robert S. and 
James W. Perry, Punched cards; New York, 
Reinhold, 1951, p. 119. 

34. Fletcher, Mona. The use of mechanical equipment 

in legislative research. American Academy of 
Political and Social Science, Annals, 195:168-75 
(1938). 

35. U.S. Department of Commerce, Bureau of the Cen- 

sus, United States census of agriculture, 1935, 
Descriptive supplement: technique of tabulation; 
Washington, D. C., U. S. Government printing 
office, 1937. 

36. Berkson, Joseph. A punch card designed to con- 

tain written data and coding. American statist- 
ical association, Journal, 36:535-38 (1941). 
37. Reference 1, p. 198-201. 

38. Reference 1, p. 200. 

39. National Research Council, Research information 

service, Mechanical aids for the classification 
of American investigators, with illustrations in 
the field of psychology, by Harold C. Bingham; 
Washington, D. C., National Research Council, 
1922. 

40. Batten, W. E. Specialized files for patent searching. 

In: Casey, Robert S. and James W. Perry, 
Punched cards; New York, Reinhold, 1951, p. 171. 




41. Schurmeyer, Walther. Selecto - ein neues auswahl- 

system fur die dokumentation. Nachrichten fur 
dokumentation, 3:33-34 (1952). 

42. Batten, W. E. A punched-card system of indexing 

to meet special requirements. Association of 
special libraries and information bureaux, Report 
of the proceedings, 22:37-39 (1947). 

43. Reference 42. 

44. Reference 41. 

45. Reference 42. 

46. Loosjes, Th. P. The Delta card. Aslib proceed- 

ings, 9 (no. 5):142-43 (May 1957). 

47. Whitworth, William Allen. Choice and chance; New 

York, Stechert, 1934, p. 1-67. 

48. Burington, Richard Stevens and Donald Curtis May, 

Jr. Handbook of probability and statistics with 
tables; Sandusky, O. , Handbook publishers, p. 
25-27. 

49. Briggs, William and G. H. Bryan. The tutorial 

algebra, rewritten by George Walker; London, 
University tutorial press, 1954, 1: 327-370. 

50. Wise, Carl S. Mathematical analysis of coding 

systems. In: Casey, Robert S. and James W. 
Perry, Punched cards; New York, Reinhold, 
1951. 

51. Draheim, Heinz and Ottokar Gdaniec. Was leisten 

zehn löcher einer randlochkarte? Ein übersichtsreferat 
über verschiedene praktische verschlüsselungsmethoden. 
Nachrichten für dokumentation, 5:201-10 (1954). 

52. Reference 51. 

53. Washburn, Earle L. New York University. In: 

Baehne, G. W. Practical applications of the 
punched card method in colleges and universities; 
New York, Columbia University press, 1935, p. 
135-44. 

54. Reference 51. 

55. National roster of scientific and specialized person- 

nel, Report ... to the National resources plan- 
ning board; Washington, D. C. , U. S. Govern- 
ment printing office, 1942, p. 15-20. 

56. Blashfield, W. H. Subscriber toll dialing tape 

reader. Electrical engineering, 72:786 (1953). 

57. Richards, R. K. Arithmetic operations in digital 

computers; Princeton, N. J., Van Nostrand, 
1955, p. 178. 

58. Reference 57, p. 178. 

59. Reference 1, p. 48. 

60. Reference 50, p. 283. 

61. Brown, George B. Use of punched cards in ac- 

quisition work: experience at Illinois. College 
and research libraries, 10:219-20, 257 (1949). 

62. Murphy, George M. An isotope file on punched 

cards. Journal of chemical education, 24:556-7 
(1947). 

63. Tagge, John. Hand- sort punch cards deliver main- 

tenance data. Factory management and mainte- 
nance, 113 (no. 2):142-44 (Feb. 1955). 

64. Young, George G. Borrower merely signs his 

name. Library journal, 74:12-16, p. 78(1949). 

65. Reference 1, p. 48-49. 

66. Reference 50, p. 282. 

67. Reference 1, p. 49. 

68. Reference 1, p. 50-51. 

69. Reference 1, p. 51-52. 

70. Dyson, G. Malcolm. Some applications of the 

Dysonian notation of organic compounds. Journal 
of chemical education, 26:294-303 (1949). 

71. Dunn, H. L. Adaptation of new geometric code to 

multiple punching in mechanical tabulation. 
American statistical association, Journal, 27:279 
-86 (1932). 

72. Reagh, Russell R. A simplified code for multiple 

card punching. American statistical association, 
Journal, 29:182-83 (1934). 

73. Reference 57, p. 178. 

74. Reference 57, p. 179. 

75. Reference 57, p. 179. 

76. Reference 57, p. 179. 

77. Reference 57, p. 179. 

78. Reference 1, p. 63. 

79. Reference 57, p. 179-180. 

80. Engineering research associates, High-speed com- 

puting devices; New York, McGraw-Hill, 1950, 
p. 289. 

81. Hartree, Douglas R. Calculating instruments and 

machines; Urbana, University of Illinois press, 
1949, p. 59-60. 




82. Royer, Elmer B. and Herbert A. Toops. The 

statistics of geometrically coded scores. American 
statistical association, Journal, 28:192-98 
(1933). 

83. Eckert, W. J. Electrons and computation. Scientific 
monthly, 67:315-23 (Jly.-Dec. 1948). 

84. Reference 1, p. 53-54. 

85. Reference 57, p. 184-192. 

86. Reference 1, p. 54. 

87. Reference 1, p. 54. 

88. Reference 1, p. 55, 57. 

89. Reference 50, p. 278. 

90. Reference 57, p. 182. 

91. Reference 50, p. 280. 

92. Reference 50, p. 279. 

93. Reference 50, p. 280. 

94. Reference 57, p. 182. 

95. Hamilton, F. E. and E. C. Kubie. The IBM 

magnetic drum calculator type 650. Association 
for computing machinery, Journal, 1 (no. 1):13- 
19 (Jan. 1954). 

96. Reference 57, p. 184. 

97. Photo-electric librarian. Electronics, 22:122, 158- 

66 (Sep. 1949). 

98. Although Richards goes on to say: "However, the 

use of codes with more than 4 bits for the purpose 
of circuit simplification has not become 
widespread." See reference 57, p. 184. 

99. Reference 57, p. 184. 

100. Krider, L. D. Applications of automatic coding 
to small calculators. Eastern joint computer 
conference, Proceedings, 47:64-67 (Dec. 1954). 

101. Reference 57, p. 185-192. 

102. Reference 57, p. 185-192, passim. 

103. Reference 57, p. 187. 

104. Livesley, R. K. An introduction to automatic 

digital computers; Cambridge, Eng., University 
press, 1957, p. 16. 

105. Reference 104, p. 16. 

106. Reference 104, p. 16. 

107. Reference 57, p. 5. 

108. Reference 57, p. 5. 

109. Reference 71. 

110. Giffler, Bernard. Simplifying quality control operations 

with marginal punch cards. Industrial 
quality control, 11 (no. 9):20-23 (Je. 1955). 

111. Reference 80, p. 84. 

112. Reference 57, p. 184. 

113. Reference 95. 

114. Reference 57, p. 184. 

115. Reference 1, p. 71. 

116. Reference 1, p. 72. 

117. Ruston, W. R. Die randlochkarte als hilfsmittel 

fur die wissenschaftliche dokumentation. Nach- 
richten fur dokumentation, 3:5-12 (1952). 

118. Reference 51. 

119. Reference 1, p. 71-78. 

120. Jellinek, E. M. , Vera Efron, and Mark Keller. 

Abstract archive of the alcohol literature. 
Quarterly journal of studies on alcohol, 8:580- 
608 (1947/48). 

121. Reference 51. 

122. Reference 1, p. 77-78. 

123. Guy, A. G. and A. H. Geisler. A punch card 

filing system for metallurgical literature. Metal 
progress, 52:993-1000 (Jly. -Dec. 1947). 

124. Moffit, Alexander. Punched card records in ser- 

ials acquisition. College and research libraries, 
7:10-13 (1946). 

125. Wise, Carl S. A punched-card file based on word 

coding. In: Casey, Robert S. and James W. 
Perry, Punched cards; New York, Reinhold, 
1951, p. 109. 

126. Cox, Gerald J., C. F. Bailey, and Robert S. 

Casey. Punch cards for a chemical bibliogra- 
phy. Chemical and engineering news, 23:1623- 
26 (1945). 

127. Reference 5. 

128. Ames, Stanley R. and Wilma F. Kujawski. Use 
of punched cards for indexing and classifying 

biochemical literature. Special libraries, 39: 
233-38 (1948). 

129. Reference 126. 

130. Reference 5. 

131. Reference 1, p. 62. 

132. Carr, A. M. Production control in a textile-finishing 

plant. In: Casey, Robert S. and James 
W. Perry, Punched cards; New York, Reinhold, 
1951, p. 241. 

133. Reference 1, p. 60. 

134. Dyson, G. Malcolm. Punched-card systems and 

their application to library and technical work; 
I. Some applications of mechanical methods to 
library problems in organic chemistry. Associa- 
tion of special libraries and information bureaux, 
Report of the proceedings, 22:23-36 (1947). 

135. Anderson, Isabella. The application of punch card 

filing in a chemical library. Illinois libraries, 
31:406-11 (1949). 

136. Westbrook, J. H. and L. EL DeWald. A modified 

punch card filing system for metallurgical litera- 
ture. Metal progress, 54:324-27 (Jly. -Dec. 1948). 

137. Reference 125, p. 111. 

138. Reference 51. 

139. Campbell, Kenneth N. and Barbara K. Campbell. 

Punched card code for general organic research. 
Industrial and engineering chemistry, 42:1458- 
60 (1950). 

140. Reference 117. 

141. Hood, T. A. Punch card for the field of metal 

finishing. Metal progress, 56:75-78 (Jly. -Dec. 
1949). 

142. Cox, Gerald J. , Robert S. Casey, and C. F. 

Bailey. Recent developments in Keysort cards. 
Journal of chemical education, 24:65-70 (1947). 

143. The coding capacity of each intersection in a triangular 

code may be determined by applying the 
permutation formula (4), where "n" is the number 
of rows and "m" is 2; the total number of 
intersections is determined by the combination 
formula (1), with "n" as the number of positions 
and "m" as 2. The total capacity is, therefore, 
the product of the results of these two 
computations; e.g., the 6-position, 2-row triangular 
code has a capacity of 2 x 15 = 30 items 
(cf. reference 1, p. 76). 

144. Reference 23, p. 8-13. 

145. Reference 23, p. 9,11,13. 

146. Cochran, S. W. Recent progress in patent clas- 

sification. Industrial and engineering chemistry, 
40:731-33 (1948). 

147. Gull, Cloyd Dake. A punched card method for the 

bibliography, abstracting, and indexing of chemical 
literature. Journal of chemical education, 
23:500-07 (1946). 

148. Nalbandjan, N. Le systeme a cartes perforees 

dans un service de bibliotheque. Les cahiers de 
la documentation, 2:43-47 (1948). 

149. Reference 124. 

150. Samain, Jacques. Une nouvelle méthode de sélection 

des documents: la mécano-mémoire. 
Hommes & techniques, no. 23/24:59-61 (Dec. 
1946). 

151. Telegraph. Encyclopaedia Britannica (1952 ed. ), 

21:887-93. 

152. Reference 151. 

153. Reference 57, p. 177. 

154. Norris, Wells. How five-channel punched tape 

mechanizes office jobs. American business, 24 
(no. 3):10-12, 36 (Mar. 1954). 

155. Clark, Robert S. Books and reading for the blind; 

London, Library association, 1950, p. 7-14. 

156. Reference 155, p. 10. 

157. Reference 155, p. 8. 

158. Formula (6) indicates that 64 combinations are pos- 

sible; however, Braille did not consider the "no 
dot" arrangement as one of the combinations. 

159. Typed figures translated into computer code. En- 

gineering, 183 (no. 4749):348-49 (Mar. 15, 1957). 

160. Shepard, David H. and Clyde C. Heasly, Jr. Pho- 

toelectric reader feeds business machines. Elec- 
tronics, 28 (no. 5):134-38 (May 1955). 

161. Printer plotter translates computer language. Elec- 

trical engineering, 76 (no. 3):262 (Mar. 1957). 

162. Symbol generator and viewer. Computers and automation, 

6 (no. 12):7-9 (Dec. 1957). 

163. Soper, Alan K. Some observations on the use of 

punched cards for a personal information file. 
Aslib proceedings, 7:251-58 (1955). 

164. Reference 125. 

165. Mooers, Calvin N. Zatocoding applied to mechan- 

ical organization of knowledge. American docu- 
mentation, 2:20-32 (1951). 

166. Reference 163. 

167. Reference 125. 

168. Reference 163. 




169. Mooers, Calvin N. Zatocoding for punched cards; 

Boston, Zator Co., 1950 (Zator technical bul- 
letin, no. 30). 

170. Reference 50. 

171. Orosz, G. and L. Takacs. Some probability prob- 

lems concerning the marking of codes into the 
superimposition field. Journal of documentation, 
12:231-34 (1956). 

172. Reference 169, p. 18-19. 

173. Reference 50, p. 293, 295. 

174. Reference 50, p. 296. 

175. Mooers, Calvin N. Putting probability to work in 

coding punched cards - Zatocoding; Boston, 
Zator Co., 1947? (Zator technical bulletin, no. 
10). 

176. Wise, Carl S. and James W. Perry. Multiple 

coding and the Rapid Selector. American docu- 
mentation, 1:76-83 (1950). 

177. Mooers, Calvin N. Coding, information retrieval, 

and the Rapid Selector. American documentation,
1:225-29 (1950).

178. Wise, Carl S. Multiple word coding vs. random 

coding for the Rapid Selector. American docu- 
mentation, 3:223-25 (1952). 

179. Reference 177. 

180. Reference 176. 

181. Reference 177. 

182. Reference 178. 

183. Reference 176. 

184. Reference 177. 

185. New filing system developed for special collections. 

Library journal, 73:794-97 (1948). 

186. Brown, William Fuller, Jr., and Glen Oneal, Jr. 

Library searches with punched-card machines.
Science, 123:722-23 (Jan.-Je. 1956).

187. Rand Corporation. A million random digits with 

100,000 normal deviates; Glencoe, Ill., Free
press, 1955.

188. Reference 186. 

189. Reference 1, p. 196-98. 

190. Reference 125. 

191. Reference 169, p. 9. 

192. Reference 50, p. 298-99. 

193. Code chemicals for index. Science news letter,

54:39 (Jly.-Dec. 1948).

194. Sherman, Jack. The use of four-hole randomly 

punched cards for abstracting publications and 
reports into IBM cards. Special libraries asso- 
ciation, Texas chapter, Bulletin, 8 (no. 4):9-14 
(May 1957). 

195. Schultz, Claire K. Coding literature on punched 

cards; unpublished MLS thesis, Drexel Institute 
of Technology, 1952. 

196. Sebring, M. W. A marginal punched card sys- 

tem for a specialized information collection. 
American documentation, 4:18-22 (1953). 

197. Perry, James W. Superimposed punching of

numerical codes on hand-sorted punch cards.
American documentation, 2:205-12 (1951).



Index 



A

Accession lists, 211, 248.
Accounting, 107.
Acquisition, 11, 107.
Adler, F. H., 81.
Admiralty Signal Establishment, Technical library, 110.
Albrecht, J. C., 255.
Alexander, S. N., 140.
Aliform, 21, 23, 41, 64, 79, 80, 81.
Alphabetic codes, 16, 17, 338, 339, 340, 341, 342, 343.
Alpha-Matrex machine, 77, 78, 91, 92.
American Chemical Society, 220.
American Documentation Institute (1954), 186.
American Findex card, 21.
American Society of Metals, 37, 121, 230.
AMFIS, 199, 200.
Ampex, 241.
Analysis of book stock, 108.
Anderson, I., 340.
Andrews, D. D., 220, 223, 227, 228.
Arizona Tool and Die Co., 36.
Armed Forces Technical Information Agency, 76.
Arnhym, A. A., 122.
Artificial alphabets, 76.
Aspect cards, 23, 25, 31.
Ashthorpe, H. D., 116.
Aspects, 17.
ASTIA, 76.
Atomic Energy Research Establishment, Harwell, England, 116.
Atomic Energy Commission cards, 90.
Autoabstracting, 286, 287, 291.
Autoencoding, 288, 289, 291.
Automatic micro-film information system, 199, 200.
Automatic microimage file, 236, 237.
Avakian, E. A., 199.

B

Bacon, Francis, 14.
Bagley, P. R., 238, 239, 240.
Bailey, C. F., 17, 180.
Battelle Institute, 165, 202.
Batten, W. E., 23, 61, 62, 63, 80, 84, 85, 86, 89, 93, 280.
Batten cards, 281, 327.
Batten-Cordonnier system, 23.
Baudot code, 343, 344.



Bedford, G. M., 182, 195.
Bendix G-15-D Computer, 234, 235, 256.
Benson-Lehner Corp., 199.
Berkeley, E. C., 146.
Berry, M. M., 105, 142, 143, 230, 233.
Bibliographic devices, 122.
Bibliographical techniques, 11, 12, 13, 27, 29, 31, 123.
Bibliography (steps of), 293.
Binary code, 14, 336.
Bjorkbom, Carl, 69.
Bloomfield, M., 13, 247, 248.
Boekeler Instrument Co., 36.
Book catalogs, 124.
Borgeaud, 23, 59, 63.
Braband, C., 80.
Bracken, R. H., 246.
Braille alphabet, 344.
Brisch and Partners, Ltd., 40.
Brisch, Inc., 1070 Union Commerce Building, Cleveland, Ohio, 70.
Brisch-Vistem, 40, 70.
British Patent Office, 62.
Brooklyn College Library, 107.
Brown, W. F., 348.
Bunshodo, Tokyo, 43.
Burchard, J. E., 113.
Burroughs Corporation-Todd Company Division, 36.
Bush, V., 112, 172, 173, 175, 199.



C

Callander, T. E., 106.
Campbell, K. N., 241.
Canova, M. F., 248.
Carter-Parratt cards, 71, 72.
Carter-Parratt, Ltd., Iddesleigh House, Caxton St., London S.W. 1, England, 40, 70, 88, 89.
Casey, R. S., 17, 348.
Cataloging, 11, 29, 110.
Census Bureau 488 Multi-Column Sorter, 221.
Centre National des Recherches Scientifiques, 65, 186.
Challons, 110.
Chain spelling, 217.
Chapin, N., 146.
Charles R. Hadley Co., 36.
Chronological indexing, 79.
Ciba, 202.
Circulation, 106, 107.
Clapp, E. A., 121.
CNRS, 65, 186.
Coblans, H., 186.
Cochran, S. W., 341.
Codes, 140, 157, 158, 159, 217, 221, 321, 329, 332, 334, 339, 340, 342, 344, 346.
Code field, 15, 19, 77.
Code position, 15.
Code section, 15, 16, 19.
Coding capacity, 19.
Coding, direct, 15, 20, 21, 26, 322, 327.
Coding, superimposed, 18, 20, 21, 30, 346, 348.
"Coincidentally" punched cards, 57.
Collating systems, 118.
Colon Classification, 62, 85.
Columbia River Regional Library, 125.



COMAC, 166, 167.
Combination code, 16, 20, 21.
Compagnie des Fichiers Modernes, 40.
Concordances, 282.
Continuous Multiple Access Collator, 166, 167.
Control holes, 19.
Coordinate indexing, 25, 36, 68, 147, 148, 149, 248.
Copeland-Chatterson Company, Ltd., 40.
Copeland-Chatterson single hole punch, 12.
Copyflex, 27.
Cordonnier, 23, 61, 63, 64, 65, 66, 77, 84, 86, 87, 91, 94.
Cordonnier card, 86.
Cordonnier system, 327.
Cost analysis, 28.
Cox, G., 17, 339, 341, 348.
Crandall, G. S., 210.
Cross-references, 11.
Crosz, G., 20.
Current list of medical literature, 123.

D

Datatron Electronic Computer, 274.
Decision points, 153.
"Deep punch", 19.
Delta card, 23, 43, 66, 67, 68, 69, 88.
Delmas, J., 114.
Dequeker, 22, 42.
Dequeker, S. A., 40.
Derbolowsky, U., 81.
Descriptors, 11, 13, 18, 150, 155, 157, 170.
Detectri, 22, 41, 63, 64.
Deutsche Bucherei, 79.
Deutsches Kunstoff-Institut, 80.
De Wald, L. H., 340.
Dewey, H., 122.
Dictionary catalog, 11.
Digital computer, 238, 239, 269.
Digital computer components, 145.
Documentation center, 114.
Documentation, Inc., Washington, D. C., 36, 72.
Donnay, J.D.H., 60, 84.
Dow Chemical Co., 265, 270.
Draheim, H., 337, 341.
Dunlop Rubber Co., 12.
Dunn, H. L., 331.
Duplimat stencils, 110.
(E.I.) duPont de Nemours Co., 202, 204, 234.
Dyson, G., 340.
Dysonian system, 321, 340.

E

Eastman, 78.
Eastman Kodak, 192, 196, 197.
Eckert, W. J., 332.
Edge-punched cards, 13, 14, 15, 26, 30, 36, 37, 43, 63, 69, 321, 322, 330.
Edler & Krische, Kestnerstrasse 42, Hannover, 79, 80.
Egan, M., 196.
Ekaha, 23, 41, 69, 79, 80.
Electrodata, 274.



Electronic Spectroanalyzer, 269.
Electronic Structure Correlator, 277, 278.
Engineering Research Associates, 173, 175, 178.
Enjay Laboratories, 268.
Esselte, Stockholm, 43.
E-Z Sort, 19, 37.



F

Fac-Tronic, 253.
Fairbanks, E. E., 61, 84.
Fairthorne, R. A., 113.
False drop, 156, 212.
Feature cards, 57, 58, 63, 65, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 81, 82, 84, 85, 88, 89, 92, 93.
Feature field, 72.
Federal Telecommunication Laboratories, 269.
Feldlochkarten, 21.
Feldman, 193.
Fiches superposables, 23, 57.
Filmorex, 140, 182, 184, 185, 186.
Film 'n File, 111.
Filmsort, 26.
Findex, 39, 69.
Fingerprint identification, 86.
Firth, F. E., 260, 261.
Flagg, C., 231, 232.
Flexisort, 38.
Flexowriter, 190, 209, 232, 234.
Food and Agriculture Organization, Fisheries Biology Branch, 65.
Frazier Precision Instrument Co., 37.
Frome, J., 227.



G

Gagarin, R., 24.
Gaikoku Bunken-Sha, Tokyo, 43.
Garfield, E., 117.
Gaylord Bros., Inc., 37.
General Electric Co., 249.
Geological Society of South Africa, 59.
Giffler, B., 336.
Gilbert, P. T., 18.
Gmelin Institute, 24, 33, 115.
Goldberg, E., 172.
Gordon-Kendall-Davison notation, 277.
Gray, C. J., 23, 59, 84.
Grid, 22.
Grobe, G., 14.
Gull, D., 25, 105, 341.

H

Hand-manipulated punched cards, 32, 33.
Hardy, F.E.M., 20.
Hartree, D. R., 332.
Hawken, W. R., 193.
Haykin, 14.
Hayne, R. L., 119.
HAYSTAQ, 240, 241, 243, 244.
Heinze, H., 81, 82, 85, 92, 93.
Herner and Company special computer, 255.
Hildebrand, B., 120.
Hit relay, 224.
Hochschule fur Elektrotechnik in Ilmenau, 79.
Hollerith cards, 62, 67, 80, 81, 92, 162, 195.
Hollerith code, 159, 214, 217.
Hollerith searching, 163.
Hollerith, 12.
Holmstrom, J., 22, 57, 59, 61, 62, 64, 65, 66, 72, 81, 84, 85, 87, 88, 89, 92, 93.
Hood, T. A., 341.
Hurlbut, C. S., 60, 84, 90.
Hyslop, 28.



I

IBM cards, 23, 32, 67, 68, 73, 127, 208, 275.
IBM Department of Education catalog, 110.
IBM, Inc., 37.
IBM machines, 123, 206.
IBM Magnetic Drum Calculator 650, 334, 337.
IBM Military Products Division, 218.
IBM-9900, 169, 170.
IBM-101, 118, 119, 145, 146, 159, 166, 202, 203, 209, 214, 215, 219, 221, 224, 268, 295.
Photoelectric scanner, 163, 165, 166.
IBM Research Center, Information Retrieval Research Dept., 218.
IBM Selective Sequence Electronic Calculator, 332.
IBM-700 Series, 245.
IBM-701, 245, 247.
IBM-702, 271.
IBM-704 Electronic Data Processing Machine, 144, 214, 218, 219, 245, 248, 249, 250, 265, 269, 287.
IBM-705 Electronic Data Processing Machine, 196, 250, 251, 252, 285.
IBM-305 RAMAC, 250, 259, 260, 261, 262, 263, 264.
IBM X-794, 166.
Identification Division of the Federal Bureau of Investigation, 86.
ILAS, 143, 144, 146, 220, 221, 224, 225, 227, 228, 229, 234.
Illinois E-Z Sort anaesthesia record card, 37.
Index headings, 154, 157.
Indexing systems, 147.
Industrial studies and investigations, 70.
Institut des Fruits et Agrumes Coloniaux, 65, 69, 84, 86, 87, 94.
Institute of Cancer Research, 12.
Integral, Dusseldorf, 42.
"Integrity" of catalog, 127.
Interfix, 223, 225, 226, 228.
International Business Machines, see IBM.
International Conference on Scientific Information (1958), 33, 171, 214, 216.
International Federation of Documentation, 14.
International Telemeter Corp., 197, 198.
Interrelated Logic Accumulating Scanner, 143, 144, 146, 220, 221, 224, 225, 227, 228, 229, 234.
Item cards, 72.



J

Jacquard, J. M., 12.
Jaeckle, J. F., 79.
Jolley, J. L., 57, 70, 71, 72, 75, 89, 90, 92.
Jones, E. G., 111.
Jonker Business Machines, 23, 37, 91.
Jonker, F., 91.

K

Kalfax film, 198.
Kent, A., 46, 142, 143, 196, 202, 230, 231, 232, 233.
Kent, Allen & James W. Perry - Centralized Information Services, 45.
Keysort, 38.
King County Library (Washington State), 124.
Kistermann, F., 66, 81, 84.
Knappe, W., 80, 92.
Krider, L. D., 334.



L

Lampel, B., 84.
Langan, 111.
Leary, 120.
Ledley, R. S., 279.
Le Febure Corporation, 38.
Legal research processes, 121.
Leibowitz, J., 227.
Level of retrieval, 93.
Levinson, 120.
Lewis, C. M., 192, 200.
Liber, 23.
Liber, Henry, 59.
Library of Congress catalog, 177.
Lichtpausverfahren, 26.
Listomatic camera, 123.
Livesley, R. K., 335.
Livingstone, G. A., 120.
Lochkartenwerk Schlitz, 42.
Logistics Research, Inc., 197.
London Telephone Directory, 20.
Loosjes, T. P., 67, 68, 69, 84, 85, 88.
Los Angeles County Public Library, 124.
Lowry, W. K., 255, 295.
Lowry and Albrecht special computer, 255, 257.
Luhn, H. P., 114, 119, 163, 165, 166, 214, 216, 286, 288, 290, 291.
Luhn Scanner, 143, 163, 165, 166, 167, 216.
"Luko" series, 164.

M

McBee cards, 26.
McBee Corporation, 13, 27, 38.
McBee Selector, 30.
McCafferty, J., 119.
McGaw, H. F., 337.
MacKinnon, F. B., 120.
MacQuarrie, C., 124.
Machine searching, 81, 87, 94, 113, 115, 116, 160.
Machine-sorted cards, 113, 115, 116, 125, 126, 322, 327, 330, 341.
Magnavox Film Data Recorder, 200.
Marginal punched cards, 13, 14, 15, 26, 30, 36, 37, 43, 321, 322, 330.
Massachusetts Institute of Technology, 165.
Matrex, 37, 88.
Matrix film, 75, 236.
Mayor, Y., 113.
Melon, J., 60.
Melton, J., 231, 232.
Memex, 199.
Merck, Sharp and Dohme, 202, 206.
Metallurgical literature card, 37.
Microcite, 24, 31, 57, 66, 75, 90.
Microdoc, 66.
Microfilm, 30, 182.
Midwest Research Institute, 269.
Milwaukee Public Library, 109, 111.
Minicard, 140, 188, 189, 190, 191, 192, 193, 194, 195, 196.
Mitchell, M. F., 254.
Modulant, 222, 225.
Moffit, A., 109, 126, 341.
Molina, E. C., 12.
Monotype, 287.
Monsanto Chemical Co., 271.
Montclair (N.J.) Public Library, 106, 108.
Mooers, C. N., 18, 39, 139, 178, 346, 347, 348, 349.
Morse code, 14, 342, 343.
Moyer, S. R., 251, 252.

N

Nalbandjan, N., 341.
National Bureau of Standards, 22, 23, 37, 39, 73, 240.
National Bureau of Standards Microimage Selector, 142.
National Library of Medicine, 122.
National Luchtvaartlaboratorium in Amsterdam, 26.
National Science Foundation, 47.
Needle sort, 36.
Needling, 15, 16, 20, 21, 22, 29.
New Serials Titles, 123.
New York State Library, 125.
Noise-effect, 17, 18, 156.
Nolan, J. J., 260, 261, 262.
Norton, J. R., 270.
Notched cards, 12.
Notions, 288, 289.
Numerical coding, 330.
Numerical sequence code, 16, 20, 21.



O

Oak Ridge Laboratory of the Atomic Energy Commission, 19, 37.
O'Connor, J. J., 254, 280.
Offenhauser, W. H., 192, 200.
Office of Basic Instrumentation, 24, 72, 74, 75, 76, 87, 90, 91, 94.
Office of Naval Research, 194.
Office of Technical Services, 173.
OLECB plan, 339.
Onderwerpponskaarten, 57.
Oneal, G., 348.
Opler, A., 267, 270.
"Optical" cards, 57.
Optical Coincidence Subject Cards, 22.
Optical searching, 87.
Orthographic single-field superimposed code, 18.
Ozalid process, 26.



P

Paint and Varnish literature card, 37.
Paramount punched card system, 40.
Parker, R. H., 105, 108, 109, 110, 111, 112.
Payroll, 111.
Peakes, G. L., 118.
Peek-a-Boo, 23, 37, 38, 39, 40, 41, 42, 43, 57, 250, 263, 327.
"Peekable" cards, 57.
"Peephole", 57.
Perkins, Alfred, 12, 13.
Perry, J. W., 28, 110, 142, 143, 165, 177, 196, 220, 222, 230, 233, 234, 238, 239, 240, 349.
Per Selecto, 64.
Personnel records, 111.
Philips' Gloeilampenfabrieken, Patent Division, 67.
Photoelectric scanning, 114, 119.
Proctor and Gamble, 202.
Pietsch, E.H.E., 115.
Pike, R. H., 111.
Plas-Ta Card, 39.
Powers cards, 80, 111.
Preddek, R., 23.
Program, 140.
Property cards, 73.
Punched card sorter components, 145.
Punched cards, 12, 13, 14, 15, 21.
Punched cards with visual punching, 22.
Punching methods, 20.
Putnam, M. E., 36.

Q

Quigley, M., 108.

R

Rabinow, J., 175.
RAMAC, 250, 259, 260, 261, 262, 263, 264.
Rand Corporation, 341, 348.
Random selection, 18.
Ranganathan, 62.
Rapid Selector, 140, 142, 163, 166, 172, 173, 174, 176, 177, 178, 179, 180, 182, 183, 195, 198, 334, 348.
Rapidtri, 40.
Reagh, R. R., 331.
Recall Incorporated, 198.
Remington Rand, 38, 206.
Retrieval, 20, 93.
Retrieval aspects, 13.
Richards, R. K., 331, 335, 336, 337, 343.
"Rocket" card, 36.
Roget's Thesaurus, 289.
Role, 119.
Rotschuh, K. E., 24, 80.
Royal McBee Corporation, 38.
Royer, E. B., 332.
Row-by-row Scanning Punched Card Sorter, 144, 146, 214, 215.
Rowe, H. T., 286.
Ruston, W. R., 337, 341.



S

Samain, J., 162, 163, 182, 185, 220, 342.
Samas, 341.
Samo, Milan, 43.
"Satellite catalogs", 78, 211.
Savage, T. R., 286, 287.
Schering, 202.
Schlitzlochkarten, 21.
Schreibende Randlochkarte, 27.
Schultz, C. K., 207.
Searching, 160, 293.
Sebring, M. W., 349.
Selecteur, 40.
Selection, 11.
Selecto, 23, 41, 64, 69, 80, 81, 86.
Selector, 21, 163.
Selector, Dequeker, 22.
Selector-code, 17, 20.
Selectri, 22, 41.
Selez, 43.
Semantic factors, 230, 231.
Semper Avanti (The Hague), 43, 66, 68.
Serials acquisition and control, 109.
Shaw, Ralph R., 32, 115, 163, 165, 166, 172, 173, 175, 177, 179, 182, 183, 185, 194, 195, 233, 293.
Self-listing, 111.
Sichtlochkarten, 23, 57.
Signal, 222, 225, 226.
Sloan-Kettering Institute, 269.
Slotted cards, 21, 43.
Smith, J., 75.
Smith, Kline and French, 202, 207.
Societe Detectri, 41.
Societe Microdoc, 41, 64.
Societe Selection, 41.
Socony Mobil, 202, 209.
Soper, H. E., 23, 58, 63, 83, 84.
Sorbonne Mineralogical Laboratory, 64.
Sorto, 43.
Special Library Association, 37.
S. L. A. Metallurgical Literature Classification, 121.
Spectrophotometer, 269.
"Sphinxo", 23, 41, 63, 64, 67.
Sta Selecto, 64.
Standards Electronic Automatic Computer (SEAC), 241, 242.
Stamford, H. P., 12.
Statitex, 41.
Steinhardt, L. R., 183.
Stern, J., 22, 24, 74, 75.
Stibitz, 332.
Stockton (Calif.) Public Library, 107.
Stoetzer, W., 114.
Stored Function Calculator, 199.
Stretch, W. M., 12.
Stroem, I., 14.
Stubenrecht, A., 27.
Stumpf, P. Q., 210.
Subject cards, 22, 23, 25.
Subject card system, 22, 23.
"Super-imposable" cards, 57.
"Superimposition noise", 77.
Superior Business Machines, Inc., 38.
Synoptic filing, 68, 88.
Synoptic system, 84.
System environment, 161.



T

Tabledex, 280.
Taube group, 76, 77, 78, 82, 83, 85, 86, 87, 88, 91, 93, 94.
Taube, M., 25, 110, 166, 167, 170, 171, 194.
Taylor, H., 23, 57, 63, 75, 83.
Technical Information Service of the Atomic Energy Commission, 73.
Technical services, 11.
Telegraphic abstract, 230, 231, 232, 234.
Termatrex, 23, 91.
Thermophysical Properties Research Center, 274.
Thesaurus, 289.
Thome, R. G., 28.
Tillitt, H. E., 246.
Titthalkoft, 57.
Transelecta, 79.
Triangle code, 17, 19.
Turim, F., 119.
Type I device, 142, 145.
Type II device, 142, 145.
Type III device, 143.
Type IV device, 143.
Type V device, 143.

U

Uhlein, E., 81, 84.
Underwood Corporation, Samas Div., 39.
Union Carbide & Chemical, 202, 212.
Uniprinter, 253.
Uniterm cards, 78.
Uniterm system, 25, 36, 68, 85, 88, 245, 248, 250, 263.
Unityper, 253.
Universal card scanner, 216, 218, 219.
Univac Fac-Tronic system, 253.
Univac, 253, 280, 281.
Universal Decimal Classification, 14.
University of Florida, 106.
University of Missouri, 106.
University of Pennsylvania (Institute for Cooperative Research), 250.
University of Wisconsin, 106.
Uses of machine-sorted punched cards, 103, 104, 106, 107.
U. S. Air Force Office of Scientific Research, 169.
U. S. Atomic Energy Commission, 72, 175.
U. S. Bureau of Standards, 181.
U. S. Census Bureau, 12.
U. S. Department of Agriculture, 142.
U. S. Department of Commerce, 173.
U. S. Library of Congress, 33, 107, 111.
U. S. Library of Congress, Technical Information Division, 26.
U. S. Naval Ordnance Station (China Lake, Calif.), 245.
U. S. Patent Office, 202, 220, 228, 229, 234, 240, 256, 263.
Universal Decimal Classification, 117.



V

Variables, 153.
VEB Organizationsmittel-Verlag, 42, 79.
Vickery, B. C., 182.
Vsesoiuznaia Gosudarstvennaia Biblioteka Inostrannoi Literatury in Moscow, 79.

W

Wachtel, L. S., 61, 84, 90.
Wainwright, L., 146.
Wassell Organization, Inc., 39.
Watertown Arsenal, 119.
Waugh, D., 109.
Weigelin, E., 14.
Weil, B. H., 120, 121.
Welch Medical Library Indexing Project, 166, 202.
"Weighted" code, 331.
Welt, I. D., 120.
Westbrook, J. H., 340.
Westendorp, J., 67, 87.
Western Reserve University, 230.
Whaley, F. R., 118, 119.
Whirlwind I, 238.
Wight, E. A., 109.
Wildhack, W. A., 22, 74, 75, 77, 90.
William K. Walters [Company], 39.
Wise, C. S., 177, 178, 333, 346, 347, 348, 349.
"Word coding", 346.
WRU Searching Selector, 232, 233.

X

X-Ray Sort Card, 38.
Xerography, 27.

Y

Yale University, 178, 179, 181.
Yes-No Code, 321, 327, 333, 338.

Z

Zatocoding, 18, 39, 346, 347, 348.
Zator Co., 39.
Zator Selector, 39.








